Description

Field of the Invention

[0001] This invention relates to granulocyte colony stimulating factor ("G-CSF") analogs.
Background 

[0002] Hematopoiesis is controlled by two systems: the cells within the bone marrow microenvironment and growth
factors. The growth factors, also called colony stimulating factors, stimulate committed progenitor cells to proliferate and
to form colonies of differentiating blood cells. One of these factors is granulocyte colony stimulating factor, herein called
G-CSF, which preferentially stimulates the growth and development of neutrophils, indicating a potential use in
neutropenic states. Welte et al., PNAS-USA 82: 1526-1530 (1985); Souza et al., Science 232: 61-65 (1986) and Gabrilove, J.
Seminars in Hematology 26: (2) 1-14 (1989).

[0003] In humans, endogenous G-CSF is detectable in blood plasma. Jones et al., Bailliere's Clinical Hematology
2 (1): 83-111 (1989). G-CSF is produced by fibroblasts, macrophages, T cells, and trophoblasts, and is the expression
product of a single copy gene comprised of four exons and five introns located on chromosome seventeen. Transcription
of this locus produces a mRNA species which is differentially processed, resulting in two forms of G-CSF mRNA, one
version coding for a protein of 177 amino acids, the other coding for a protein of 174 amino acids, Nagata et al., EMBO J
5: 575-581 (1986), and the form comprised of 174 amino acids has been found to have the greatest specific in vivo
biological activity. G-CSF is species cross-reactive, such that when human G-CSF is administered to another mammal such
as a mouse, canine or monkey, sustained neutrophil leukocytosis is elicited. Moore et al., PNAS-USA 84: 7134-7138 (1987).
[0004] Human G-CSF can be obtained and purified from a number of sources. Natural human G-CSF (nhG-CSF)
can be isolated from the supernatants of cultured human tumor cell lines. The development of recombinant DNA
technology, see, for instance, U.S. Patent 4,810,643 (Souza) incorporated herein by reference, has enabled the production
of commercial scale quantities of G-CSF in glycosylated form as a product of eukaryotic host cell expression, and of
G-CSF in non-glycosylated form as a product of prokaryotic host cell expression.

[0005] G-CSF has been found to be useful in the treatment of indications where an increase in neutrophils will
provide benefits. For example, for cancer patients, G-CSF is beneficial as a means of selectively stimulating neutrophil
production to compensate for hematopoietic deficits resulting from chemotherapy or radiation therapy. Other indications
include treatment of various infectious diseases and related conditions, such as sepsis, which is typically caused by a
metabolite of bacteria. G-CSF is also useful alone, or in combination with other compounds, such as other cytokines,
for growth or expansion of cells in culture, for example, for bone marrow transplants.

[0006] Signal transduction, the way in which G-CSF affects cellular metabolism, is not currently thoroughly
understood. G-CSF binds to a cell-surface receptor which apparently initiates the changes within particular progenitor cells,
leading to cell differentiation.

[0007] Various altered G-CSFs have been reported. Generally for design of drugs, certain changes are known to
have certain structural effects. For example, deleting one cysteine could result in the unfolding of a molecule which,
in its unaltered state, is normally folded via a disulfide bridge. There are other known methods for adding, deleting or
substituting amino acids in order to change the function of a protein.

[0008] Recombinant human G-CSF mutants have been prepared, but the method of preparation does not include
overall structure/function relationship information. For example, the mutation and biochemical modification of Cys 18
has been reported. Kuga et al., Biochem. Biophys. Res. Comm. 159: 103-111 (1989); Lu et al., Arch. Biochem. Biophys.
268: 81-92 (1989).

[0009] In U.S. Patent No. 4,810,643, entitled "Production of Pluripotent Granulocyte Colony-Stimulating Factor"
(as cited above), polypeptide analogs and peptide fragments of G-CSF are disclosed generally. Specific G-CSF
analogs disclosed include those with the cysteines at positions 17, 36, 42, 64, and 74 (of the 174 amino acid species or of
those having 175 amino acids, the additional amino acid being an N-terminal methionine) substituted with another
amino acid (such as serine), and G-CSF with an alanine in the first (N-terminal) position.

[0010] EP 0 335 423, entitled "Modified human G-CSF," reportedly discloses the modification of at least one amino
group in a polypeptide having hG-CSF activity.

[0011] EP 0 272 703, entitled "Novel Polypeptide," reportedly discloses G-CSF derivatives having an amino acid
substituted or deleted at or "in the neighborhood" of the N terminus.

[0012] EP 0 459 630, entitled "Polypeptides," reportedly discloses derivatives of naturally occurring G-CSF having
at least one of the biological properties of naturally occurring G-CSF and a solution stability of at least 35% at 5 mg/ml
in which the derivative has at least Cys17 of the native sequence replaced by a Ser17 residue and Asp27 of the native
sequence replaced by a Ser27 residue.

[0013] EP 0 256 843, entitled "Expression of G-CSF and Muteins Thereof and Their Uses," reportedly discloses a
modified DNA sequence encoding G-CSF wherein the N-terminus is modified for enhanced expression of protein in
recombinant host cells, without changing the amino acid sequence of the protein.

[0014] EP 0 243 153, entitled "Human G-CSF Protein Expression," reportedly discloses G-CSF to be modified by
inactivating at least one yeast KEX2 protease processing site for increased yield in recombinant production using yeast.

[0015] Shaw, U.S. Patent No. 4,904,584, entitled "Site-Specific Homogeneous Modification of Polypeptides,"
reportedly discloses lysine altered proteins.

[0016] WO/9012874 reportedly discloses cysteine altered variants of proteins.

[0017] Australian patent application Document No. AU-A-10948/92, entitled "Improved Activation of Recombinant
Proteins," reportedly discloses the addition of amino acids to either terminus of a G-CSF molecule for the purpose of
aiding in the folding of the molecule after prokaryotic expression.

[0018] Australian patent application Document No. AU-A-76380/91, entitled "Muteins of the Granulocyte Colony
Stimulating Factor (G-CSF)," reportedly discloses muteins of the granulocyte stimulating factor G-CSF in the sequence
Leu-Gly-His-Ser-Leu-Gly-Ile at position 50-56 of G-CSF with 174 amino acids, and position 53 to 59 of the G-CSF with
177 amino acids, or/and at least one of the four histidine residues at positions 43, 79, 156 and 170 of the mature
G-CSF with 174 amino acids or at positions 46, 82, 159, or 173 of the mature G-CSF with 177 amino acids.

[0019] GB 2 213 821, entitled "Synthetic Human Granulocyte Colony Stimulating Factor Gene," reportedly discloses
a synthetic G-CSF-encoding nucleic acid sequence incorporating restriction sites to facilitate the cassette mutagenesis
of selected regions, and flanking restriction sites to facilitate the incorporation of the gene into a desired expression
system.

[0020] G-CSF has reportedly been crystallized to some extent, e.g., EP 344 796, and the overall structure of
G-CSF has been surmised, but only on a gross level. Bazan, Immunology Today 11: 350-354 (1990); Parry et al., J.
Molecular Recognition 8: 107-110 (1988). To date, there have been no reports of the overall structure of G-CSF, and no
systematic studies of the relationship of the overall structure and function of the molecule, studies which are essential to
the systematic design of G-CSF analogs. Accordingly, there exists a need for a method of this systematic design of
G-CSF analogs, and the resultant compositions.

Summary of the Invention

[0021] The three dimensional structure of G-CSF has now been determined to the atomic level. From this
three-dimensional structure, one can now forecast with substantial certainty how changes in the composition of a G-CSF
molecule may result in structural changes. These structural characteristics may be correlated with biological activity to
design and produce G-CSF analogs.

[0022] Although others had speculated regarding the three dimensional structure of G-CSF, Bazan, Immunology
Today 11: 350-354 (1990); Parry et al., J. Molecular Recognition 8: 107-110 (1988), these speculations were of no help
to those wishing to prepare G-CSF analogs either because the surmised structure was incorrect (Parry et al., supra)
and/or because the surmised structure provided no detail correlating the constituent moieties with structure. The
present determination of the three-dimensional structure to the atomic level is by far the most complete analysis to date,
and provides important information to those wishing to design and prepare G-CSF analogs. For example, from the
present three dimensional structural analysis, precise areas of hydrophobicity and hydrophilicity have been determined.

[0023] Relative hydrophobicity is important because it directly relates to the stability of the molecule. Generally,
biological molecules, found in aqueous environments, are externally hydrophilic and internally hydrophobic; in accordance
with the second law of thermodynamics, this is the lowest energy state and provides for stability. Although one
could have speculated that G-CSF's internal core would be hydrophobic, and the outer areas would be hydrophilic, one
would have had no way of knowing specific hydrophobic or hydrophilic areas. With the presently provided knowledge of
areas of hydrophobicity/philicity, one may forecast with substantial certainty which changes to the G-CSF molecule will
affect the overall structure of the molecule.
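
By way of illustration only, the idea of judging a substitution by its effect on hydrophobicity can be sketched in code. The sketch below uses the published Kyte-Doolittle hydropathy scale, which is an assumption of this example and not the method of the present disclosure (here the hydrophobic and hydrophilic areas are determined from the three dimensional structure itself). The function names and the tolerance value are likewise hypothetical.

```python
# Kyte-Doolittle hydropathy values per one-letter amino acid code.
# Assumption: a sequence-based scale stands in for the structure-derived
# hydrophobic/hydrophilic areas described in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_change(original: str, replacement: str) -> float:
    """Return the hydropathy of the replacement minus that of the original."""
    return KYTE_DOOLITTLE[replacement] - KYTE_DOOLITTLE[original]


def is_conservative(original: str, replacement: str, tolerance: float = 1.0) -> bool:
    """True if the substitution shifts hydropathy by no more than `tolerance`.

    The tolerance of 1.0 is an arbitrary illustrative threshold.
    """
    return abs(hydropathy_change(original, replacement)) <= tolerance
```

Under these assumptions, replacing a leucine with an isoleucine would count as conservative, whereas replacing it with an aspartic acid would not; a buried residue changed in the latter way is the kind of alteration one might expect to disturb the core.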

[0024] As a general rule, one may use knowledge of the geography of the hydrophobic and hydrophilic regions to
design analogs in which the overall G-CSF structure is not changed, but the change does affect biological activity
("biological activity" being used here in its broadest sense to denote function). One may correlate biological activity to
structure. If the structure is not changed, and the mutation has no effect on biological activity, then the mutation has no
biological function. If, however, the structure is not changed and the mutation does affect biological activity, then the
residue (or atom) is essential to at least one biological function. Some of the present working examples were designed to
provide no change in overall structure, yet have a change in biological function.
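
The reasoning of the preceding paragraph can be written out as a small decision function. This is a minimal sketch: the function name and the returned wording are hypothetical, and the branch for a changed structure is an assumption of this example, since the paragraph draws conclusions only for the case where structure is unchanged.

```python
def interpret_mutation(structure_changed: bool, activity_changed: bool) -> str:
    """Interpret a mutation from its observed effect on structure and activity."""
    if not structure_changed and not activity_changed:
        # Structure intact, activity intact.
        return "mutation has no biological function"
    if not structure_changed and activity_changed:
        # Structure intact, activity altered.
        return "residue (or atom) is essential to at least one biological function"
    # Assumption: when the overall structure itself changes, any activity
    # change cannot be attributed to the altered residue alone.
    return "inconclusive: overall structure was altered"
```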

[0025] Based on the correlation of structure to biological activity, the present invention relates to G-CSF analogs.
These analogs are molecules which have more, fewer, different or modified amino acid residues from the G-CSF amino
acid sequence. The modifications may be by addition, substitution, or deletion of one or more amino acid residues. The
modification may include the addition or substitution of analogs of the amino acids themselves, such as
peptidomimetics or amino acids with altered moieties such as altered side groups. The G-CSF used as a basis for comparison may
be of human, animal or recombinant nucleic acid-technology origin (although the working examples disclosed herein
are based on the recombinant production of the 174 amino acid species of human G-CSF, having an extra N-terminus
methionyl residue). The analogs may possess functions different from the natural human G-CSF molecule, or may exhibit
the same functions, or varying degrees of the same functions. For example, the analogs may be designed to have a
higher or lower biological activity, have a longer shelf-life or a decrease in stability, be easier to formulate, or more
difficult to combine with other ingredients. The analogs may have no hematopoietic activity, and may therefore be useful as
an antagonist against G-CSF effect (as, for example, in the overproduction of G-CSF). From time to time herein the
present analogs are referred to as proteins or peptides for convenience, but contemplated herein are other types of
molecules, such as peptidomimetics or chemically modified peptides.

[0026] In another aspect, the present disclosure relates to related compositions containing a G-CSF analog as an
active ingredient. The term "related composition," as used herein, is meant to denote a composition which may be
obtained once the identity of the G-CSF analog is ascertained (such as a G-CSF analog labeled with a detectable label,
related receptor or pharmaceutical composition). Also considered a related composition are chemically modified
versions of the G-CSF analog, such as those having attached at least one polyethylene glycol molecule.

[0027] For example, one may prepare a G-CSF analog to which a detectable label is attached, such as a
fluorescent, chemiluminescent or radioactive molecule.

[0028] Another example is a pharmaceutical composition which may be formulated by known techniques using
known materials, see, e.g., Remington's Pharmaceutical Sciences, 18th Ed. (1990, Mack Publishing Co., Easton,
Pennsylvania 18042) pages 1435-1712, which are herein incorporated by reference. Generally, the formulation will depend
on a variety of factors such as administration, stability, production concerns and other factors. The G-CSF analog may
be administered by injection or by pulmonary administration via inhalation. Enteric dosage forms may also be available
for the present G-CSF analog compositions, and therefore oral administration may be effective. G-CSF analogs may be
inserted into liposomes or other microcarriers for delivery, and may be formulated in gels or other compositions for
sustained release. Although preferred compositions will vary depending on the use to which the composition will be put,
generally, for G-CSF analogs having at least one of the biological activities of natural G-CSF, preferred pharmaceutical
compositions are those prepared for subcutaneous injection or for pulmonary administration via inhalation, although the
particular formulations for each type of administration will depend on the characteristics of the analog.
[0029] Another example of related composition is a receptor for the present analog. As used herein, the term
"receptor" indicates a moiety which selectively binds to the present analog molecule. For example, antibodies, or
fragments thereof, or "recombinant antibodies" (see Huse et al., Science 246: 1275 (1989)) may be used as receptors.
Selective binding does not mean only specific binding (although binding-specific receptors are encompassed herein),
but rather that the binding is not a random event. Receptors may be on the cell surface or intra- or extra-cellular, and
may act to effectuate, inhibit or localize the biological activity of the present analogs. Receptor binding may also be a
triggering mechanism for a cascade of activity indirectly related to the analog itself. Also contemplated herein are
nucleic acids, vectors containing such nucleic acids and host cells containing such nucleic acids which encode such
receptors.

[0030] Another example of a related composition is a G-CSF analog with a chemical moiety attached. Generally,
chemical modification may alter biological activity or antigenicity of a protein, or may alter other characteristics, and
these factors will be taken into account by a skilled practitioner. As noted above, one example of such chemical moiety
is polyethylene glycol. Modification may include the addition of one or more hydrophilic or hydrophobic polymer
molecules, fatty acid molecules, or polysaccharide molecules. Examples of chemical modifiers include polyethylene glycol,
alkylpolyethylene glycols, DI-poly(amino acids), polyvinylpyrrolidone, polyvinyl alcohol, pyran copolymer, acetic
acid/acylation, propionic acid, palmitic acid, stearic acid, dextran, carboxymethyl cellulose, pullulan, or agarose. See
Francis, Focus on Growth Factors 3: 4-10 (May 1992) (published by Mediscript, Mountview Court, Friern Barnet Lane,
London N20 0LD, UK). Also, chemical modification may include an additional protein or portion thereof, use of a
cytotoxic agent, or an antibody. The chemical modification may also include lecithin.

[0031] In another aspect, the present disclosure relates to nucleic acids encoding such analogs. The nucleic acids
may be DNAs or RNAs or derivatives thereof, and will typically be cloned and expressed on a vector, such as a phage
or plasmid containing appropriate regulatory sequences. The nucleic acids may be labeled (such as using a radioactive,
chemiluminescent, or fluorescent label) for diagnostic or prognostic purposes, for example. The nucleic acid sequence
may be optimized for expression, such as including codons preferred for bacterial expression. The nucleic acid and its
complementary strand, and modifications thereof which do not prevent encoding of the desired analog are here
contemplated.

[0032] In another aspect, the present disclosure relates to host cells containing the above nucleic acids encoding
the present analogs. Host cells may be eukaryotic or prokaryotic, and expression systems may include extra steps
relating to the attachment (or prevention) of sugar groups (glycosylation), proper folding of the molecule, the addition or
deletion of leader sequences or other factors incident to recombinant expression.

[0033] In another aspect, the present disclosure relates to antisense nucleic acids which act to prevent or modify the
type or amount of expression of such nucleic acid sequences. These may be prepared by known methods.

[0034] In another aspect of the present disclosure, the nucleic acids encoding a present analog may be used for
gene therapy purposes, for example, by placing a vector containing the analog-encoding sequence into a recipient so
the nucleic acid itself is expressed inside the recipient who is in need of the analog composition. The vector may first
be placed in a carrier, such as a cell, and then the carrier placed into the recipient. Such expression may be localized
or systemic. Other carriers include non-naturally occurring carriers, such as liposomes or other microcarriers or
particles, which may act to mediate gene transfer into a recipient.

[0035] The present disclosure also provides for computer programs for the expression (such as visual display) of
the G-CSF or analog three dimensional structure, and further, a computer program which expresses the identity of each
constituent of a G-CSF molecule and the precise location within the overall structure of that constituent, down to the
atomic level. Set forth below is one example of such program. There are many currently available computer programs
for the expression of the three dimensional structure of a molecule. Generally, these programs provide for inputting of
the coordinates for the three dimensional structure of a molecule (i.e., for example, a numerical assignment for each
atom of a G-CSF molecule along an x, y, and z axis), means to express (such as visually display) such coordinates,
means to alter such coordinates and means to express an image of a molecule having such altered coordinates. One
may program crystallographic information, i.e., the coordinates of the location of the atoms of a G-CSF molecule in
three dimensional space, wherein such coordinates have been obtained from crystallographic analysis of said G-CSF
molecule, into such programs to generate a computer program for the expression (such as visual display) of the G-CSF
three dimensional structure. Also provided, therefore, is a computer program for the expression of G-CSF analog three
dimensional structure. Preferred is the computer program Insight II, version 4, available from Biosym, San Diego,
California, with the coordinates as set forth in FIGURE 5 input. Preferred expression means is on a Silicon Graphics 320
VGX computer, with Crystal Eyes glasses (also available from Silicon Graphics), which allows one to view the G-CSF
molecule or its analog stereoscopically. Alternatively, the present G-CSF crystallographic coordinates and diffraction
data are also deposited in the Protein Data Bank, Chemistry Department, Brookhaven National Laboratory, Upton, New
York 119723, USA. One may use these data in preparing a different computer program for expression of the three
dimensional structure of a G-CSF molecule or analog thereof. Therefore, another aspect of the present invention is a
computer program for the expression of the three dimensional structure of a G-CSF molecule. Also provided is said
computer program for visual display of the three dimensional structure of a G-CSF molecule; and further, said program
having means for altering such visual display. Apparatus useful for expression of such computer program, particularly
for the visual display of the computer image of said three dimensional structure of a G-CSF molecule or analog thereof
is also therefore here provided, as well as means for preparing said computer program and apparatus.
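
As a non-limiting sketch of the three program elements just described (inputting coordinates, expressing them, and altering them), the following reads atom records, prints them as text, and returns a shifted copy. It assumes the coordinates are supplied as Protein Data Bank style fixed-column ATOM records; the actual layout of the coordinate listing in Figure 5 is not assumed. All names (`Atom`, `parse_atom_record`, `load_atoms`, `translate`, `describe`) are hypothetical, and a textual listing stands in for the visual display of a graphics program.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Atom:
    name: str            # atom name, e.g. "CA"
    residue: str         # three-letter residue name, e.g. "MET"
    residue_number: int  # position of the residue in the sequence
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Parse one PDB-style ATOM record using its fixed column positions."""
    return Atom(
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def load_atoms(text: str) -> List[Atom]:
    """Input step: collect every ATOM record found in the text."""
    return [parse_atom_record(l) for l in text.splitlines() if l.startswith("ATOM")]


def translate(atoms: List[Atom], dx: float, dy: float, dz: float) -> List[Atom]:
    """Alteration step: return copies of the atoms shifted by (dx, dy, dz)."""
    return [replace(a, x=a.x + dx, y=a.y + dy, z=a.z + dz) for a in atoms]


def describe(atoms: List[Atom]) -> str:
    """Expression step: one line per atom (a stand-in for a visual display)."""
    return "\n".join(
        f"{a.residue}{a.residue_number} {a.name}: ({a.x:.3f}, {a.y:.3f}, {a.z:.3f})"
        for a in atoms
    )
```

A rigid translation is chosen only because it is the simplest alteration to show; a real modelling program would alter individual residues and recompute the affected geometry.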
[0036] The computer program is useful for preparation of G-CSF analogs because one may select specific sites on
the G-CSF molecule for alteration and readily ascertain the effect the alteration will have on the overall structure of the
G-CSF molecule. Selection of said site for alteration will depend on the desired biological characteristic of the G-CSF
analog. If one were to randomly change said G-CSF molecule (r-met-hu-G-CSF) there would be 175^20 possible
substitutions, and even more analogs having multiple changes, additions or deletions. By viewing the three dimensional
structure wherein said structure is correlated with the composition of the molecule, the selection for sites of alteration is no
longer a random event, but sites for alteration may be determined rationally.
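
To give a rough sense of why random alteration is impractical, the tallies below count single and double substitutions for a 175-residue molecule. This is an illustrative calculation only: it assumes 20 standard amino acids and counts substitutions alone (no additions or deletions), and the function names are hypothetical.

```python
from math import comb

NUM_STANDARD_AMINO_ACIDS = 20  # assumption: only the 20 standard residues
SEQUENCE_LENGTH = 175          # 174-residue G-CSF plus the N-terminal methionine


def single_substitution_count(length: int, alphabet: int) -> int:
    """Number of distinct analogs differing from the parent at exactly one position."""
    return length * (alphabet - 1)


def double_substitution_count(length: int, alphabet: int) -> int:
    """Number of distinct analogs differing from the parent at exactly two positions."""
    return comb(length, 2) * (alphabet - 1) ** 2


singles = single_substitution_count(SEQUENCE_LENGTH, NUM_STANDARD_AMINO_ACIDS)
doubles = double_substitution_count(SEQUENCE_LENGTH, NUM_STANDARD_AMINO_ACIDS)
print(singles, doubles)  # 3325 5496225
```

Even under these narrow assumptions the count grows from thousands to millions on moving from one change to two, which is the practical argument for choosing sites from the structure instead of at random.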

[0037] As set forth above, identity of the three dimensional structure of G-CSF, including the placement of each
constituent down to the atomic level, has now yielded information regarding which moieties are necessary to maintain
the overall structure of the G-CSF molecule. One may therefore select whether to maintain the overall structure of the
G-CSF molecule when preparing a G-CSF analog of the present invention, or whether (and how) to change the overall
structure of the G-CSF molecule when preparing a G-CSF analog of the present invention. Optionally, once one has
prepared such analog, one may test such analog for a desired characteristic.

[0038] One may, for example, seek to maintain the overall structure possessed by a non-altered natural or
recombinant G-CSF molecule. The overall structure is presented in Figures 2, 3, and 4, and is described in more detail below.
Maintenance of the overall structure may ensure receptor binding, a necessary characteristic for an analog possessing
the hematopoietic capabilities of natural G-CSF (if no receptor binding, signal transduction does not result from the
presence of the analog). It is contemplated that one class of G-CSF analogs will possess the three dimensional core
structure of a natural or recombinant (non-altered) G-CSF molecule, yet possess different characteristics, such as an
increased ability to selectively stimulate neutrophils. Another class of G-CSF analogs are those with a different overall
structure which diminishes the ability of a G-CSF analog molecule to bind to a G-CSF receptor, and possesses a
diminished ability to selectively stimulate neutrophils as compared to non-altered natural or recombinant G-CSF.
[0039] For example, it is now known which moieties within the internal regions of the G-CSF molecule are
hydrophobic, and, correspondingly, which moieties on the external portion of the G-CSF molecule are hydrophilic. Without
knowledge of the overall three dimensional structure, preferably to the atomic level as provided herein, one could not
forecast which alterations within this hydrophobic internal area would result in a change in the overall structural
conformation of the molecule. An overall structural change could result in a functional change, such as lack of receptor
binding, for example, and therefore, diminishment of biological activity as found in non-altered G-CSF. Another class of
G-CSF analogs is therefore G-CSF analogs which possess the same hydrophobicity as (non-altered) natural or
recombinant G-CSF. More particularly, another class of G-CSF analogs possesses the same hydrophobic moieties within the
four helical bundle of its internal core as those hydrophobic moieties possessed by (non-altered) natural or recombinant
G-CSF yet have a composition different from said non-altered natural or recombinant G-CSF.
[0040] Another example relates to external loops which are structures which connect the internal core (helices) of
the G-CSF molecule. From the three dimensional structure - including information regarding the spatial location of the
amino acid residues - one may forecast that certain changes in certain loops will not result in overall conformational
changes. Therefore, another class of G-CSF analogs provided herein is that having an altered external loop but
possessing the same overall structure as (non-altered) natural or recombinant G-CSF. More particularly, another class of
G-CSF analogs provided herein are those having an altered external loop, said loop being selected from the loop
present between helices A and B; between helices B and C; between helices C and D; between helices D and A, as
those loops and helices are identified herein. More particularly, said loops, preferably the AB loop and/or the CD loop,
are altered to increase the half life of the molecule by stabilizing said loops. Such stabilization may be by connecting all
or a portion of said loop(s) to a portion of an alpha helical bundle found in the core of a G-CSF (or analog) molecule.
Such connection may be via beta sheet, salt bridge, disulfide bonds, hydrophobic interaction or other connecting means
available to those skilled in the art, wherein such connecting means serves to stabilize said external loop or loops. For
example, one may stabilize the AB or CD loops by connecting the AB loop to one of the helices within the internal region
of the molecule.

[0041] The N-terminus also may be altered without change in the overall structure of a G-CSF molecule, because
the N-terminus does not affect structural stability of the internal helices, and, although the external loops are preferred
for modification, the same general statements apply to the N-terminus.

[0042] Additionally, such external loops may be the site(s) for chemical modification because in (non-altered)
natural or recombinant G-CSF such loops are relatively flexible and tend not to interfere with receptor binding. Thus, there
would be additional room for a chemical moiety to be directly attached (or indirectly attached via another chemical
moiety which serves as a chemical connecting means). The chemical moiety may be selected from a variety of moieties
available for modification of one or more function of a G-CSF molecule. For example, an external loop may provide sites
for the addition of one or more polymer which serves to increase serum half-life, such as a polyethylene glycol molecule.
Such polyethylene glycol molecule(s) may be added wherein said loop is altered to include additional lysines which
have reactive side groups to which polyethylene glycol moieties are capable of attaching. Other classes of chemical
moieties may also be attached to one or more external loops, including but not limited to other biologically active
molecules, such as receptors, other therapeutic proteins (such as other hematopoietic factors which would engender a
hybrid molecule), or cytotoxic agents (such as diphtheria toxin). This list is of course not complete; one skilled in the art
possessed of the desired chemical moiety will have the means to effect attachment of said desired moiety to the desired
external loop. Therefore, another class of the present G-CSF analogs includes those with at least one alteration in an
external loop wherein said alteration provides for the addition of a chemical moiety such as at least one polyethylene
glycol molecule.

[0043] Deletions, such as deletions of sites recognized by proteins for degradation of the molecule, may also be
effected in the external loops. This provides alternative means for increasing half-life of a molecule otherwise having
the G-CSF receptor binding and signal transduction capabilities (i.e., the ability to selectively stimulate the maturation
of neutrophils). Therefore, another class of the present G-CSF analogs includes those with at least one alteration in an
external loop wherein said alteration decreases the turnover of said analog by proteases. Preferred loops for such
alterations are the AB loop and the CD loop. One may prepare an abbreviated G-CSF molecule by deleting a portion of the
amino acid residues found in the external loops (identified in more detail below); said abbreviated G-CSF molecule may
have additional advantages in preparation or in biological function.

[0044] Another example relates to the relative charges between amino acid residues which are in proximity to each
other. As noted above, the G-CSF molecule contains a relatively tightly packed four helical bundle. Some of the faces
on the helices face other helices. At the point (such as a residue) where a helix faces another helix, the two amino acid
moieties which face each other may have the same charge, and thus tend to repel each other, which lends instability to
the overall molecule. This may be eliminated by changing the charge (to an opposite charge or a neutral charge) of one
or both of the amino acid moieties so that there is no repelling. Therefore, another class of G-CSF analogs includes
those G-CSF analogs having been altered to modify instability due to surface interactions, such as electron charge
location.
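
The search for facing residues of like charge can be sketched as a distance test over charged side chains. This is a minimal illustration under stated assumptions: each residue is reduced to a single representative coordinate, formal charges are assigned for roughly neutral pH (histidine is left out), and the 6.0 angstrom cutoff is an arbitrary illustrative value. The names `Residue`, `CHARGE`, and `repelling_pairs` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist
from typing import List, Tuple

# Assumption: formal side-chain charges near neutral pH; histidine omitted.
CHARGE = {"ASP": -1, "GLU": -1, "LYS": +1, "ARG": +1}


@dataclass(frozen=True)
class Residue:
    name: str                             # three-letter residue name
    number: int                           # position in the sequence
    position: Tuple[float, float, float]  # representative coordinate, angstroms


def repelling_pairs(residues: List[Residue], cutoff: float = 6.0) -> List[Tuple[int, int]]:
    """Return residue-number pairs that carry the same charge and lie within `cutoff`."""
    pairs = []
    for a, b in combinations(residues, 2):
        charge_a = CHARGE.get(a.name, 0)
        charge_b = CHARGE.get(b.name, 0)
        same_sign = charge_a != 0 and charge_a == charge_b
        if same_sign and dist(a.position, b.position) <= cutoff:
            pairs.append((a.number, b.number))
    return pairs
```

Each pair reported by such a test would be a candidate for changing one or both residues to an opposite or neutral charge, subject to inspection of the structure itself.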

[0045] The present invention relates to methods for designing G-CSF analogs and related compositions and the
products of those methods. The end products of the methods may be the G-CSF analogs as defined above or related
compositions. For instance, the examples disclosed herein demonstrate (a) the effects of changes in the constituents
(i.e., chemical moieties) of the G-CSF molecule on the G-CSF structure and (b) the effects of changes in structure on
biological function. Essentially, therefore, an aspect of the present invention is a method for preparing a G-CSF analog
comprising the steps of:

(a) viewing at an amino acid or atomic level information conveying the three dimensional structure of a G-CSF
molecule as set forth in Figure 5 wherein the chemical moieties, such as each amino acid residue or each atom of each
amino acid residue, of the G-CSF molecule are correlated with said structure;

(b) selecting from said information a site on a G-CSF molecule for alteration;

(c) preparing a G-CSF analog molecule having such alteration; and

(d) optionally, testing such G-CSF analog molecule for a desired characteristic.

[0046] One may use the computer programs provided herein for a computer-based method for preparing a G-CSF
analog. Another aspect of the present invention is therefore a method for preparing a G-CSF analog according to the
method of the preceding paragraph based on the use of a computer comprising the steps of:

(a) providing computer expression of the three dimensional structure of a G-CSF molecule wherein the chemical
moieties, such as each amino acid residue or each atom of each amino acid residue, of the G-CSF molecule are
correlated with said structure;

(b) selecting from said computer expression a site on a G-CSF molecule for alteration;

(c) preparing a G-CSF molecule having such alteration; and

(d) optionally, testing such G-CSF molecule for a desired characteristic.

[0047] More specifically, the present invention provides a method for preparing a G-CSF analog comprising the
steps of:

(a) viewing at the amino acid or atomic level the three dimensional structure of a G-CSF molecule as set forth in
Figure 5 via a computer, said computer programmed (i) to express the coordinates of a G-CSF molecule in three
dimensional space, and (ii) to allow for entry of information for alteration of said G-CSF expression and viewing
thereof;

(b) selecting a site on said visual image of said G-CSF molecule for alteration;

(c) entering information for said alteration on said computer;

(d) viewing a three dimensional structure of said altered G-CSF molecule via said computer;

(e) optionally repeating steps (a)-(e);

(f) preparing a G-CSF analog with said alteration; and

(g) optionally testing said G-CSF analog for a desired characteristic.

[0048] In another aspect the present disclosure relates to methods of using the present G-CSF analogs and
related compositions and methods for the treatment or protection of mammals, either alone or in combination with other
hematopoietic factors or drugs in the treatment of hematopoietic disorders. It is contemplated that one aspect of
designing G-CSF analogs will be the goal of enhancing or modifying the characteristics non-modified G-CSF is known to have.
[0049] For example, the analogs may possess enhanced or modified activities, so, where G-CSF is useful in the
treatment of (for example) neutropenia, the present compositions and methods may also be of such use.

[0050] Another example is the modification of G-CSF for the purpose of interacting more effectively when used in
combination with other factors, particularly in the treatment of hematopoietic disorders. One example of such
combination use is to use an early-acting hematopoietic factor (i.e., a factor which acts earlier in the hematopoiesis
cascade on relatively undifferentiated cells) and either simultaneously or in seriatim use of a later-acting hematopoietic
factor, such as G-CSF or analog thereof (as G-CSF acts on the CFU-GM lineage in the selective stimulation of neutrophils).
The methods and compositions may be useful in therapy involving such combinations or "cocktails" of hematopoietic factors.
[0051] The compositions and methods may also be useful in the treatment of leukopenia, myelogenous leukemia,
severe chronic neutropenia, aplastic anemia, glycogen storage disease, mucositis, and other bone marrow failure
states. The compositions and methods may also be useful in the treatment of hematopoietic deficits arising from
chemotherapy or from radiation therapy. The success of bone marrow transplantation, or the use of peripheral blood
progenitor cells for transplantation, for example, may be enhanced by application of the present compositions (proteins
or nucleic acids for gene therapy) and methods. The compositions and methods may also be useful in the treatment of
infectious diseases, such as in the context of wound healing, burn treatment, bacteremia, septicemia, fungal infections,
endocarditis, osteopyelitis, infection related to abdominal trauma, infections not responding to antibiotics, and pneumonia;
the treatment of bacterial inflammation may also benefit from the application of the compositions and methods. In
addition, the compositions and methods may be useful in the treatment of leukemia based upon a reported ability to
differentiate leukemic cells. Welte et al., PNAS-USA 82: 1526-1530 (1985). Other applications include the treatment of
individuals with tumors, using the compositions and methods, optionally in the presence of receptors (such as
antibodies) which bind to the tumor cells. For review articles on therapeutic applications, see Lieschke and Burgess,
N. Engl. J. Med. 327: 28-34 and 99-106 (1992), both of which are herein incorporated by reference.
[0052] The compositions and methods may also be useful to act as intermediaries in the production of other
moieties; for example, G-CSF has been reported to influence the production of other hematopoietic factors and this
function (if ascertained) may be enhanced or modified via the present compositions and/or methods.

[0053] The compositions related to the present G-CSF analogs, such as receptors, may be useful to act as an
antagonist which prevents the activity of G-CSF or an analog. One may obtain a composition with some or all of the
activity of non-altered G-CSF or a G-CSF analog, and add one or more chemical moieties to alter one or more
properties of such G-CSF or analog. With knowledge of the three dimensional conformation, one may forecast the best
geographic location for such chemical modification to achieve the desired effect.

[0054] General objectives in chemical modification may include improved half-life (such as reduced renal,
immunological or cellular clearance), altered bioactivity (such as altered enzymatic properties, dissociated bioactivities
or activity in organic solvents), reduced toxicity (such as concealing toxic epitopes, compartmentalization, and selective
biodistribution), altered immunoreactivity (reduced immunogenicity, reduced antigenicity or adjuvant action), or altered
physical properties (such as increased solubility, improved thermal stability, improved mechanical stability, or
conformational stabilization). See Francis, Focus on Growth Factors 3: 4-10 (May 1992) (published by Mediscript,
Mountview Court, Friern Barnet Lane, London N20 0LD, UK).

[0055] The examples below are illustrative of the present invention and are not intended as a limitation. It is
understood that variations and modifications will occur to those skilled in the art, and it is intended that the appended
claims cover all such equivalent variations which come within the scope of the invention as claimed.

Detailed Description of the Drawings 
[0056] 

FIGURE 1 is an illustration of the amino acid sequence of the 174 amino acid species of G-CSF with an additional
N-terminal methionine (Seq. ID No.: 1) (Seq. ID No.: 2).

FIGURE 2 is a topology diagram of the crystalline structure of G-CSF, as well as hGH, pGH, GM-CSF, INF-β, IL-2,
and IL-4. These illustrations are based on inspection of cited references. The lengths of secondary structural
elements are drawn in proportion to the number of residues. A, B, C, and D helices are labeled according to the
scheme used herein for G-CSF. For INF-β, the original labeling of helices is indicated in parentheses.

FIGURE 3 is a "ribbon diagram" of the three dimensional structure of G-CSF. Helix A is amino acid residues 11-39
(numbered according to Figure 1, above), helix B is amino acid residues 72-91, helix C is amino acid residues 100-123,
and helix D is amino acid residues 143-173. The relatively short 3₁₀ helix is at amino acid residues 45-48, and the alpha
helix is at amino acid residues 48-53. Residues 93-95 form almost one turn of a left handed helix.

FIGURE 4 is a "barrel diagram" of the three dimensional structure of G-CSF. Shown in various shades of gray are
the overall cylinders and their orientations for the three dimensional structure of G-CSF. The numbers indicate
amino acid residue position according to FIGURE 1 above.

FIGURE 5 is a list of the coordinates used to generate a computer-aided visual image of the three-dimensional
structure of G-CSF. The coordinates are set forth below. The columns correspond to separate fields:

(i) Field 1 (from the left hand side) is the atom;

(ii) Field 2 is the assigned atom number;

(iii) Field 3 is the atom name (according to the periodic table standard nomenclature, with CB being carbon
atom Beta, CG is Carbon atom Gamma, etc.);

(iv) Field 4 is the residue type (according to three letter nomenclature for amino acids as found in, e.g., Stryer,
Biochemistry, 3d Ed., W.H. Freeman and Company, N.Y. 1988, inside back cover);

(v) Fields 5-7 are the x-axis, y-axis and z-axis positions of the atom;

(vi) Field 8 (often a "1.00") designates occupancy at that position;

(vii) Field 9 designates the B-factor;

(viii) Field 10 designates the molecule designation. Three molecules (designated a, b, and c) of G-CSF
crystallized together as a unit. The designation a, b, or c indicates which coordinates are from which molecule. The
number after the letter (1, 2, or 3) indicates the assigned amino acid residue position, with molecule A having
assigned positions 10-175, molecule B having assigned positions 210-375, and molecule C having assigned
positions 410-575. These positions were so designated so that there would be no overlap among the three
molecules which crystallized together. (The "W" designation indicates water).
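
The non-overlapping residue numbering described for Field 10 can be illustrated with a short sketch. It assumes, from the stated ranges (10-175, 210-375, 410-575), that molecules B and C are offset from molecule A by 200 and 400 positions respectively. The function name `split_assigned_position` is hypothetical and is not part of the Figure 5 listing.

```python
# Assigned residue ranges as stated for Field 10 of Figure 5.
MOLECULE_RANGES = {
    "A": (10, 175),
    "B": (210, 375),
    "C": (410, 575),
}


def split_assigned_position(position):
    """Return (molecule letter, residue position within that molecule).

    Assumption: each molecule's range is molecule A's range shifted by a
    constant offset, so subtracting the shift recovers positions 10-175.
    """
    for letter, (first, last) in MOLECULE_RANGES.items():
        if first <= position <= last:
            return letter, position - (first - 10)
    raise ValueError(f"position {position} is outside the assigned ranges")
```

For instance, under this assumption assigned position 375 would correspond to residue 175 of molecule B.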

FIGURE 6 is a schematic representation of the strategy involved in refining the crystallization matrix for parameters
involved in crystallization. The crystallization matrix corresponds to the final concentration of the components
(salts, buffers and precipitants) of the crystallization solutions in the wells of a 24 well tissue culture plate. These
concentrations are produced by pipetting the appropriate volume of stock solutions into the wells of the microtiter
plate. To design the matrix, the crystallographer decides on an upper and lower concentration of the component.
These upper and lower concentrations can be pipetted along either the rows (e.g., A1-A6, B1-B6, C1-C6 or D1-D6)
or along the entire tray (A1-D6). The former method is useful for checking reproducibility of crystal growth of a
single component along a limited number of wells, whereas the latter method is more useful in initial screening. The
results of several stages of refinement of the crystallization matrix are illustrated by a representation of three plates.
The increase in shading in the wells indicates a positive crystallization result which, in the final stages, would be
X-ray quality crystals but in the initial stages could be oil droplets, granular precipitates or small crystals
approximately less than 0.05 mm in size. Part A represents an initial screen of one parameter in which the range of
concentration between the first well (A1) and last well (D6) is large and the concentration increase between wells is
calculated as ((concentration A1)-(concentration D6))/23. Part B represents that in later stages of the crystallization
matrix refinement the concentration spread between A1 and D6 would be reduced, which would result in
more crystals formed per plate. Part C indicates a final stage of matrix refinement in which quality crystals are
found in most wells of the plate.
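
The whole-tray gradient described for Part A can be sketched as follows. This is only an illustration under stated assumptions: the lower concentration is placed in well A1 and the upper in well D6, wells are filled row by row, and the step is the magnitude of the A1/D6 difference divided by 23. The function name `tray_gradient` is hypothetical and does not correspond to any program named in this specification.

```python
import string


def tray_gradient(lower, upper, rows=4, cols=6):
    """Map well labels A1..D6 to evenly spaced concentrations.

    Assumption: the gradient runs row by row from A1 (lower) to D6 (upper).
    """
    n_wells = rows * cols
    step = (upper - lower) / (n_wells - 1)  # 23 intervals for 24 wells
    wells = {}
    for index in range(n_wells):
        row_letter = string.ascii_uppercase[index // cols]
        label = f"{row_letter}{index % cols + 1}"
        wells[label] = lower + index * step
    return wells
```

Narrowing `lower` and `upper` around the wells that gave a positive result and regenerating the tray corresponds to the successive refinement shown in Parts B and C.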

Detailed Description of the Invention

[0057] The present invention grows out of the discovery of the three dimensional structure of G-CSF. This three
dimensional structure has been expressed via computer program for stereoscopic viewing. By viewing this
stereoscopically, structure-function relationships have been identified and G-CSF analogs have been designed and made.

The Overall Three Dimensional Structure of G-CSF

[0058] The G-CSF used to ascertain the structure was a non-glycosylated 174 amino acid species having an extra
N-terminal methionine residue incident to bacterial expression. The DNA and amino acid sequence of this G-CSF are
illustrated in FIGURE 1.

[0059] Overall, the three dimensional structure of G-CSF is predominantly helical, with 103 of the 175 residues
forming a 4-alpha-helical bundle. The only other secondary structure is found in the loop between the first two long
helices where a 4 residue 3₁₀ helix is immediately followed by a 6 residue alpha helix. As shown in FIGURE 2, the overall
structure has been compared with the structure reported for other proteins: growth hormone (Abdel-Meguid et al.,
PNAS-USA 84: 6434 (1987) and Vos et al., Science 255: 305-312 (1992)), granulocyte macrophage colony stimulating
factor (Diederichs et al., Science 254: 1779-1782 (1991)), interferon-β (Senda et al., EMBO J. 11: 3193-3201 (1992)),
interleukin-2 (McKay, Science 257: 1673-1677 (1992)) and interleukin-4 (Powers et al., Science 256: 1673-1677 (1992),
and Smith et al., J. Mol. Biol. 224: 899-904 (1992)). Structural similarity among these growth factors occurs despite the
absence of similarity in their amino acid sequences.

[0060] Presently, the structural information was correlated with G-CSF biochemistry, and this can be summarized as
follows (with sequence position 1 being at the N-terminus):

Sequence Position     Description of Structure            Analysis

1-10                  Extended chain                      Deletion causes no loss of biological activity

Cys18                 Partially buried                    Reactive with DTNB and Thimerosal but not with
                                                          iodo-acetate

34                    Alternative splice site             Insertion reduces biological activity

20-47 (inclusive)     Helix A, first disulfide and        Predicted receptor binding region based on
                      portion of AB helix                 neutralizing antibody data

20, 23, 24            Helix A                             Single alanine mutation of residue(s) reduces
                                                          biological activity. Predicted receptor binding
                                                          (Site B).

165-175 (inclusive)   Carboxy terminus                    Deletion reduces biological activity

[0061] This biochemical information, having been gleaned from antibody binding studies, see Layton et al.,
Biochemistry 266: 23815-23823 (1991), was superimposed on the three-dimensional structure in order to design G-CSF
analogs. The design, preparation, and testing of these G-CSF analogs is described in Example 1 below.

EXAMPLE 1

[0062] This Example describes the preparation of crystalline G-CSF, the visualization of the three dimensional
structure of recombinant human G-CSF via computer-generated image, the preparation of analogs using site-directed
mutagenesis or nucleic acid amplification methods, the biological assays and HPLC analysis used to analyze the
G-CSF analogs, and the resulting determination of overall structure/function relationships. All cited publications are
herein incorporated by reference.

A. Use of Automated Crystallization

[0063] The need for a three-dimensional structure of recombinant human granulocyte colony stimulating factor
(r-hu-G-CSF), and the availability of large quantities of the purified protein, led to methods of crystal growth by incomplete
factorial sampling and seeding. Starting with the implementation of incomplete factorial crystallization described by
Jancarik and Kim, J. Appl. Crystallogr. 24: 409 (1991), solution conditions that yielded oil droplets and birefringent
aggregates were ascertained. Also, software and hardware of an automated pipetting system were modified to produce
some 400 different crystallization conditions per day. Weber, J. Appl. Crystallogr. 20: 366-373 (1987). This procedure
led to a crystallization solution which produced r-hu-G-CSF crystals.

[0064] The size, reproducibility and quality of the crystals were improved by a seeding method in which the number
of "nucleation initiating units" was estimated by serial dilution of a seeding solution. These methods yielded
reproducible growth of 2.0 mm r-hu-G-CSF crystals. The space group of these crystals is P2₁2₁2₁ with cell dimensions of
a=90 Å, b=110 Å and c=49 Å, and they diffract to a resolution of 2.0 Å.

1. Overall Methodology

[0065] To search for the crystallizing conditions of a new protein, Carter and Carter, J. Biol. Chem. 254: 12219-12223
(1979) proposed the incomplete factorial method. They suggested that a sampling of a large number of randomly
selected, but generally probable, crystallizing conditions may lead to a successful combination of reagents that produce
protein crystallization. This idea was implemented by Jancarik and Kim, J. Appl. Crystallogr. 24: 409 (1991), who
described 32 solutions for the initial crystallization trials which cover a range of pH, salts and precipitants. Here we
describe an extension of their implementation to an expanded set of 70 solutions. To minimize the human effort and
error of solution preparation, the method has been programmed for an automatic pipetting machine.

[0066] Following Weber's method of successive automated grid searching (SAGS), J. Cryst. Growth 90: 318-324
(1988), the robotic system was used to generate a series of solutions which continually refined the crystallization
conditions of temperature, pH, salts and precipitant. Once a solution that could reproducibly grow crystals was
determined, a seeding technique which greatly improved the quality of the crystals was developed. When these methods
were combined, hundreds of diffraction quality crystals (crystals diffracting to at least about 2.5 Angstroms, preferably
having at least portions diffracting to below 2 Angstroms, and more preferably, approximately 1 Angstrom) were
produced in a few days.

[0067] Generally, the method for crystallization, which may be used with any protein one desires to crystallize,
comprises the steps of:

(a) combining aqueous aliquots of the desired protein with either (i) aliquots of a salt solution, each aliquot having
a different concentration of salt; or (ii) aliquots of a precipitant solution, each aliquot having a different concentration
of precipitant, optionally wherein each combined aliquot is combined in the presence of a range of pH;

(b) observing said combined aliquots for precrystalline formations, and selecting said salt or precipitant
combination and said pH which is efficacious in producing precrystalline forms, or, if no precrystalline forms are so
produced, increasing the protein starting concentration of said aqueous aliquots of protein;

(c) after said salt or said precipitant concentration is selected, repeating step (a) with said previously unselected
solution in the presence of said selected concentration; and

(d) repeating step (b) and step (a) until a crystal of desired quality is obtained.

[0068] The above method may optionally be automated, which provides vast savings in time and labor. Preferred
protein starting concentrations are between 10 mg/ml and 20 mg/ml; however, this starting concentration will vary with
the protein (the G-CSF below was analyzed using 33 mg/ml). A preferred range of salt solution to begin analysis with is
(NaCl) of 0-2.5 M. A preferred precipitant is polyethylene glycol 8000; however, other precipitants include organic
solvents (such as ethanol), polyethylene glycol molecules having a molecular weight in the range of 500-20,000, and other
precipitants known to those skilled in the art. The preferred pH range is pH 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, and
9.0. Precrystallization forms include oils, birefringent precipitants, small crystals (< approximately 0.05 mm),
medium crystals (approximately 0.05 to 0.5 mm) and large crystals (> approximately 0.5 mm). The preferred time for
waiting to see a crystalline structure is 48 hours, although weekly observation is also preferred, and generally, after about
one month, a different protein concentration is utilized (generally the protein concentration is increased). Automation is
preferred, using the Accuflex system as modified. The preferred automation parameters are described below.
[0069] Generally, protein with a concentration between 10 mg/ml and 20 mg/ml was combined with a range of NaCl
solutions from 0-2.5 M, and each such combination was performed (separately) in the presence of the above range of
concentrations. Once a precrystallization structure is observed, that salt concentration and pH range are optimized in
a separate experiment until the desired crystal quality is achieved. Next, the precipitant concentration, in the presence
of varying levels of pH, is also optimized. When both are optimized, the optimal conditions are performed at once to
achieve the desired result (this is diagrammed in FIGURE 6).
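
The initial salt-by-pH screen described above can be laid out as a simple grid. The sketch below is illustrative only: the pH values and the 0-2.5 M NaCl range come from the text, while the number of salt steps (six) and the function name `salt_ph_grid` are assumptions introduced for the example.

```python
from itertools import product

# pH values 4.5, 5.0, ..., 9.0 as listed in the text.
PH_VALUES = [4.5 + 0.5 * i for i in range(10)]


def salt_ph_grid(n_salt_steps=6, salt_min=0.0, salt_max=2.5):
    """Return (NaCl molarity, pH) pairs covering the stated ranges.

    Assumption: salt concentrations are evenly spaced; the text does not
    state how many intermediate concentrations were used.
    """
    step = (salt_max - salt_min) / (n_salt_steps - 1)
    salts = [round(salt_min + i * step, 3) for i in range(n_salt_steps)]
    return list(product(salts, PH_VALUES))
```

Each pair would correspond to one well; the same layout could be reused for the precipitant-by-pH screen by substituting precipitant concentrations for the salt values.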

a. Implementation of an automated pipetting system

[0070] Drops and reservoir solutions were prepared by an Accuflex pipetting system (ICN Pharmaceuticals, Costa
Mesa, CA) which is controlled by a personal computer that sends ASCII codes through a standard serial interface. The
pipetter samples six different solutions by means of a rotating valve and pipettes these solutions onto a plate whose
translation in an x-y coordinate system can be controlled. The vertical component of the system manipulates a syringe
that is capable both of dispensing and retrieving liquid.

[0071] The software provided with the Accuflex was based on the SAGS method as proposed by Cox and Weber,
J. Appl. Crystallogr. 20: 366-373 (1987). This method involves the systematic variation of two major crystallization
parameters, pH and precipitant concentration, with provision to vary two others. While building on these concepts, the
software used here provided greater flexibility in the design and implementation of the crystallization solutions used in
the automated grid searching strategy. As a result of this flexibility the present software also created a larger number of
different solutions. This is essential for the implementation of the incomplete factorial method as described in that
section below.

[0072] To improve the speed and design of the automated grid searching strategy, the Accuflex pipetting system
required software and hardware modifications. The hardware changes allowed the use of two different micro-titer trays,
one used for hanging drop and one used for sitting drop experiments, and a Plexiglas tray which held 24 additional
buffer, salt and precipitant solutions. These additional solutions expanded the grid of crystallizing conditions that could
be surveyed.

[0073] To utilize the hardware modifications, the pipetting software was written in two subroutines: one subroutine
allows the crystallographer to design a matrix of crystallization solutions based on the concentrations of their
components, and the second subroutine translates these concentrations into the computer code which pipettes the proper
volumes of the solutions into the crystallization trays. The concentration matrices can be generated by either of two
programs. The first program (MRF, available from Amgen, Inc., Thousand Oaks, CA) refers to a list of stock solution
concentrations supplied by the crystallographer and calculates the required volume to be pipetted to achieve the designated
concentration. The second method, which is preferred, incorporates a spread sheet program (Lotus) which can be
used to make more sophisticated gradients of precipitants or pH. The concentration matrix created by either program
is interpreted by the control program (SUX, a modification of the program found in the Accuflex pipetter originally and
available from Amgen, Inc., Thousand Oaks, CA) and the wells are filled accordingly.
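
The conversion from a designated concentration to a pipetted volume can be sketched with the usual dilution relation C1·V1 = C2·V2. This is not the MRF or SUX program, whose internals are not described here; the function names, the use of water as diluent, and the units in the example are assumptions made for illustration.

```python
def stock_volume_ul(stock_conc, target_conc, final_volume_ul):
    """Volume of stock giving target_conc in final_volume_ul (C1*V1 = C2*V2)."""
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    return target_conc * final_volume_ul / stock_conc


def well_recipe(stocks, targets, final_volume_ul):
    """Return per-component stock volumes plus the diluent needed.

    stocks and targets map component names to concentrations in the same
    units; the remainder of the well is assumed to be made up with water.
    """
    recipe = {
        name: stock_volume_ul(stocks[name], conc, final_volume_ul)
        for name, conc in targets.items()
    }
    water = final_volume_ul - sum(recipe.values())
    if water < 0:
        raise ValueError("stocks too dilute for the requested final volume")
    recipe["water"] = water
    return recipe
```

For example, with a hypothetical 5 M NaCl stock, a target of 2.5 M in a 1000 µl well would call for 500 µl of that stock.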

b. Implementation of the Incomplete Factorial Method

[0074] The convenience of the modified pipetting system for preparing diverse solutions improved the
implementation of an expanded incomplete factorial method. A new set of crystallization solutions having "random"
components was generated using the program INFAC, Carter et al., J. Cryst. Growth 90: 60-73 (1988), which
produced a list containing 96 random combinations of one factor from three variables. Combinations of calcium and
phosphate which immediately precipitated were eliminated, leaving 70 distinct combinations of precipitants, salts and
buffers. These combinations were prepared using the automated pipetter and incubated for 1 week. The mixtures were
inspected and solutions which formed precipitants were prepared again with lower concentrations of their components.
This was repeated until all wells were clear of precipitant.
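
The idea of drawing random combinations and discarding incompatible ones can be sketched as below. This is a simplified illustration and not the INFAC program: it uses plain random sampling without the balancing a true incomplete factorial design applies, and the factor levels in `EXAMPLE_FACTORS` are hypothetical placeholders rather than the solutions actually used.

```python
import random

# Hypothetical factor levels, for illustration only.
EXAMPLE_FACTORS = {
    "precipitant": ["PEG 8000", "ethanol", "ammonium sulfate"],
    "salt": ["calcium chloride", "sodium chloride", "magnesium chloride"],
    "buffer": ["phosphate", "acetate", "Mes"],
}


def incomplete_factorial(factors, n_trials, seed=0):
    """Draw n_trials random combinations, one level per factor."""
    rng = random.Random(seed)  # fixed seed so the list is reproducible
    names = list(factors)
    return [
        {name: rng.choice(factors[name]) for name in names}
        for _ in range(n_trials)
    ]


def drop_incompatible(trials):
    """Remove combinations pairing a calcium salt with a phosphate buffer."""
    return [
        t for t in trials
        if not ("calcium" in t["salt"] and "phosphate" in t["buffer"])
    ]
```

Calling `drop_incompatible(incomplete_factorial(EXAMPLE_FACTORS, 96))` mirrors the step in which the 96 generated combinations were reduced by removing those containing both calcium and phosphate.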
c. Crystallization of r-hu-G-CSF

[0075] Several different crystallization strategies were used to find a solution which produced x-ray quality crystals.
These strategies included the use of the incomplete factorial method, refinement of the crystallization conditions using
successive automated grid searches (SAGS), implementation of a seeding technique and development of a crystal
production procedure which yielded hundreds of quality crystals overnight. Unless otherwise noted, the screening and
production of r-hu-G-CSF crystals utilized the hanging drop vapor diffusion method. Afinsen et al., Physical principles of
protein crystallization. In: Eisenberg (ed.), Advances in Protein Chemistry 41: 1-33 (1991).
[0076] The initial screening for crystallization conditions of r-hu-G-CSF used the Jancarik and Kim, J. Appl.
Crystallogr. 24: 409 (1991) incomplete factorial method which resulted in several solutions that produced "precrystallization"
results. These results included birefringent precipitants, oils and very small crystals (< 0.05 mm). These
precrystallization solutions then served as the starting points for systematic screening.

[0077] The screening process required the development of crystallization matrices. These matrices corresponded
to the concentration of the components in the crystallization solutions and were created using the IBM-PC based
spread sheet Lotus and implemented with the modified Accuflex pipetting system. The strategy in designing the
matrices was to vary one crystallization condition (such as salt concentration) while holding the other conditions, such as
pH and precipitant concentration, constant. At the start of screening, the concentration range of the varied condition was
large but the concentration was successively refined until all wells in the micro-titer tray produced the same
crystallization result. These results were scored as follows: crystals, birefringent precipitate, granular precipitate, oil droplets
and amorphous mass. If the concentration of a crystallization parameter did not produce at least a precipitant, the
concentration of that parameter was increased until a precipitant formed. After each tray was produced, it was left
undisturbed for at least two days and then inspected for crystal growth. After this initial screening, the trays were then
inspected on a weekly basis.

[0078] From this screening process, two independent solutions with the same pH and precipitant but differing in
salts (MgCl, LiSO4) were identified which produced small (0.1 x 0.05 x 0.05 mm) crystals. Based on these results, a
new series of concentration matrices were produced which varied MgCl with respect to LiSO4 while keeping the other
crystallization parameters constant. This series of experiments resulted in identification of a solution which produced
diffraction quality crystals (> approximately 0.5 mm) in about three weeks. To find this crystallization growth solution
(100 mM Mes pH 5.8, 380 mM MgCl2, 220 mM LiSO4 and 8% PEG 8k) approximately 8,000 conditions had been
screened which consumed about 300 mg of protein.

[0079] The size of the crystals depended on the number of crystals forming per drop. Typically 3 to 5 crystals would
be formed with average size of (1.0 x 0.7 x 0.7 mm). Two morphologies which had an identical space group (P2₁2₁2₁)
and unit cell dimensions a=90.2, b=110.2, c=49.5 were obtained depending on whether or not seeding (see below) was
implemented. Without seeding, the r-hu-G-CSF crystals had one long flat surface and rounded edges.
[0080] When seeding was employed, crystals with sharp faces were observed in the drop within 4 to 6 hours (0.05
by 0.05 by 0.05 mm). Within 24 hours, crystals had grown to (0.7 by 0.7 by 0.7 mm) and continued to grow beyond 2
mm depending on the number of crystals forming in the drop.

d. Seeding and determination of nucleation initiation sites.

[0081] The presently provided method for seeding crystals establishes the number of nucleation initiation units in
each individual well used (here, after the optimum conditions for growing crystals had been determined). The method
here is advantageous in that the number of "seeds" affects the quality of the crystals, and this in turn affects the degree
of resolution. The present seeding here also provides advantages in that with seeding, G-CSF crystal grows in a period
of about 3 days, whereas without seeding, the growth takes approximately three weeks.

[0082] In one series of production growth (see methods), showers of small but well defined crystals were produced
overnight (< 0.01 x 0.01 x 0.01 mm). Crystallization conditions were followed as described above except that a pipette
tip employed previously had been reused. Presumably, the crystal showering effect was caused by small nucleation
units which had formed in the used tip and which provided sites of nucleation for the crystals. Addition of a small amount
(0.5 µl) of the drops containing the crystal showers to a new drop under standard production growth conditions resulted
in a shower of crystals overnight. This method was used to produce several trays of drops containing crystal showers
which we termed "seed stock".

[0083] The number of nucleation initiation units (NIU) contained within the "seed stock" drops was estimated to
attempt to improve the reproducibility and quality of the r-hu-G-CSF crystals. To determine the number of NIU in the
"seed stock", an aliquot of the drop was serially diluted along a 96 well microtiter plate. The microtiter plate was
prepared by adding 50 µl of a solution containing equal volumes of r-hu-G-CSF (33 mg/ml) and the crystal growth solution
(described above) in each well. An aliquot (3 µl) of one of the "seed stock" drops was transferred to the first well of the
microtiter plate. The solution in the well was mixed and 3 µl was then transferred to the next well along the row of the
microtiter plate. Each row of the microtiter plate was similarly prepared and the tray was sealed with plastic tape.
Overnight, small crystals formed in the bottom of the wells of the microtiter plate and the number of crystals in the wells was
correlated to the dilution of the original "seed stock". To produce large single crystals, the "seed stock" drop was
appropriately diluted into fresh CGS and then an aliquot of this solution containing the NIU was transferred to a drop.
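
The serial dilution described above (3 µl carried into 50 µl per well) implies a dilution of 3/53 at each transfer. The sketch below works through that arithmetic and back-calculates an NIU concentration from a crystal count. It rests on assumptions not stated in the text: that each crystal arises from one NIU, that a row holds 12 wells, and that the counted well holds 50 µl after the next transfer has been removed. The function names are hypothetical.

```python
def dilution_factors(n_wells=12, transfer_ul=3.0, well_ul=50.0):
    """Cumulative dilution of the seed stock in each successive well."""
    per_step = transfer_ul / (transfer_ul + well_ul)  # 3 / 53 per transfer
    return [per_step ** (k + 1) for k in range(n_wells)]


def estimate_stock_niu_per_ul(crystal_count, well_index, volume_ul=50.0,
                              transfer_ul=3.0, well_ul=50.0):
    """Back-calculate NIU per microliter of seed stock from one counted well.

    well_index is zero-based: 0 is the first well that received seed stock.
    """
    factor = dilution_factors(well_index + 1, transfer_ul, well_ul)[well_index]
    return crystal_count / volume_ul / factor
```

Under these assumptions, wells further along the row contain progressively fewer NIU, which is what allows a dilution yielding only a few crystals per drop to be selected.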
[0084] Once crystallization conditions had been optimized, crystals were grown in a production method in which 3
ml each of CGS and r-hu-G-CSF (33 mg/ml) were mixed to create 5 trays (each having 24 wells). This method included
the production of the refined crystallization solution in liter quantities, mixing this solution with protein and placing the
protein/crystallization solution in either hanging drop or sitting drop trays. This process typically yielded 100 to 300
quality crystals (>0.5 mm) in about 5 days.


e. Experimental Methods 
Materials 

[0085] Crystallographic information was obtained starting with r-hu-met-G-CSF with the amino acid sequence as
provided in FIGURE 1 with a specific activity of 1.0 ± 0.6 x 10^8 U/mg (as measured by cell mitogenesis assay) in a 10
mM acetate buffer at pH 4.0 (in Water for Injection) at a concentration of approximately 3 mg/ml. The solution was
concentrated with an Amicon concentrator at 75 psi using a YM10 filter. The solution was typically concentrated 10 fold
at 4°C and stored for several months.


Initial Screening 

[0086] Crystals suitable for X-ray analysis were obtained by vapor-diffusion equilibrium using hanging drops. For
preliminary screening, 7 ul of the protein solution at 33 mg/ml (as prepared above) was mixed with an equal volume of
the well solution, placed on siliconized glass plates and suspended over the well solution utilizing Linbro tissue culture
plates (Flow Laboratories, McLean, Va). All of the pipetting was performed with the Accuflex pipetter; however, trays
were removed from the automated pipetter after the well solutions had been created and thoroughly mixed for at least
10 minutes with a table top shaker. The Linbro trays were then returned to the pipetter, which added the well and protein
solutions to the siliconized cover slips. The cover slips were then inverted and sealed over 1 ml of the well solutions with
silicon grease.

[0087] The components of the automated crystallization system are as follows. A PC-DOS computer system was
used to design a matrix of crystallization solutions based on the concentration of their components. These matrices
were produced with either MRF or the Lotus spreadsheet (described above). The final product of these programs is a
data file. This file contains the information required by the SUX program to pipette the appropriate volume of the stock
solutions to obtain the concentrations described in the matrices. The SUX program information was passed through a
serial I/O port and used to dictate to the Accuflex pipetting system the position of the valve relative to the stock
solutions, the amount of solution to be retrieved and then pipetted into the wells of the microtiter plates, and the X-Y
position of each well (the column/row of each well). Additional information was transmitted to the pipetter, which included
the Z position (height) of the syringe during filling as well as the position of a drain where the system pauses to purge the
syringe between fillings of different solutions. The 24 well microtiter plate (either Linbro or Cryschem) and cover slip
holder was placed on a plate which was moved in the X-Y plane. Movement of the plate allowed the pipetter to position
the syringe to pipette into the wells. It also positioned the coverslips and vials so that solutions could be extracted from
these sources. Prior to pipetting, the Linbro microtiter plates had a thin film of grease applied around the edges of the
wells. After the crystallization solutions were prepared in the wells and before they were transferred to the cover slips,
the microtiter plate was removed from the pipetting system, and solutions were allowed to mix on a table top shaker for
ten minutes. After mixing, the well solution was either transferred to the cover slips (in the case of the hanging drop
protocol) or transferred to the middle post in the well (in the case of the sitting drop protocol). Protein was extracted
from a vial and added to the coverslip drop containing the well solution (or to the post). Plastic tape was applied to the
top of the Cryschem plate to seal the wells.
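The conversion from a matrix entry (target concentrations) to pipetting volumes of stock solutions can be illustrated as follows. This is only a sketch of the arithmetic (C1 x V1 = C2 x V2), not the SUX program or its data file format, neither of which is described in the text. The component names and stock concentrations are hypothetical.

```python
def stock_volumes(targets, stocks, final_volume_ul):
    """Volume of each stock (ul) needed to reach the target concentrations in one well.

    targets and stocks map a component name to a concentration in the same unit
    per component; the remainder of the well is made up with water.
    """
    volumes = {name: final_volume_ul * conc / stocks[name] for name, conc in targets.items()}
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stock solutions are too dilute for the requested well")
    volumes["water"] = water
    return volumes

# Hypothetical matrix entry and stock concentrations, for illustration only.
example = stock_volumes(
    targets={"buffer": 100.0, "salt": 380.0, "precipitant": 8.0},
    stocks={"buffer": 1000.0, "salt": 2000.0, "precipitant": 40.0},
    final_volume_ul=1000.0,
)
```

Each row of such a matrix would yield one set of volumes, which is the kind of information the data file would need to carry to the pipetter.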


Production Growth 

[0088] Once conditions for crystallization had been optimized, crystal growth was performed utilizing a "production"
method. The crystallization solution, which contained 100 mM Mes pH 5.8, 380 mM MgCl2, 220 mM LiSO4, and 8%
PEG 8K, was made in 1 liter quantities. Utilizing an Eppendorf syringe pipetter, 1 ml aliquots of this solution were pipetted
into each of the wells of the Linbro plate. A solution containing 50% of this solution and 50% G-CSF (33 mg/ml) was
mixed and pipetted onto the siliconized cover slips. Typical volumes of these drops were between 50 and 100 ul and,
because of the large size of these drops, great care was taken in flipping the coverslips and suspending the drops over
the wells.
Data Collection 

[0089] The structure has been refined with X-PLOR (Brünger, X-PLOR version 3.0, A system for crystallography
and NMR, Yale University, New Haven, CT) against 2.2 Å data collected on an R-AXIS (Molecular Structure Corp.,
Houston, TX) imaging plate detector.

f. Observations

[0090] As an effective recombinant human therapeutic, r-hu-G-CSF has been produced in large quantities and
gram levels have been made available for structural analysis. The crystallization methods provided herein are likely to
find other applications as other proteins of interest become available. This method can be applied to any
crystallographic project which has large quantities of protein (approximately >200 mg). As one skilled in the art will
recognize, the present materials and methods may be modified and equivalent materials and methods may be available
for crystallization of other proteins.

B. Computer Program For Visualizing The Three Dimensional Structure of G-CSF

[0091] Although diagrams, such as those in the Figures herein, are useful for visualizing the three dimensional
structure of G-CSF, a computer program which allows for stereoscopic viewing of the molecule is contemplated as
preferred. This stereoscopic viewing, or "virtual reality" as those in the art sometimes refer to it, allows one to visualize
the structure in its three dimensional form from every angle in a wide range of resolution, from macromolecular structure
down to the atomic level. The computer programs contemplated herein also allow one to change perspective of the
viewing angle of the molecule, for example by rotating the molecule. The contemplated programs also respond to
changes so that one may, for example, delete, add, or substitute one or more images of atoms, including entire amino
acid residues, or add chemical moieties to existing or substituted groups, and visualize the change in structure.
[0092] Other computer based systems may be used, the elements being: (a) a means for entering information,
such as orthogonal coordinates or other numerically assigned coordinates of the three dimensional structure of G-CSF;
(b) a means for expressing such coordinates, such as visual means so that one may view the three dimensional
structure and correlate such three dimensional structure with the composition of the G-CSF molecule, such as the amino
acid composition; (c) optionally, means for entering information which alters the composition of the G-CSF molecule
expressed, so that the image of such three dimensional structure displays the altered composition.
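The three elements (a) to (c) can be pictured with a small data model. This sketch is an assumption made for illustration and is not the program described in this specification: the class names, the textual "display", and the rule of keeping only backbone atoms after a substitution are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple

BACKBONE = ("N", "CA", "C", "O")  # atoms retained when a side chain is swapped (assumption)

@dataclass(frozen=True)
class Atom:
    name: str
    x: float
    y: float
    z: float

@dataclass(frozen=True)
class Residue:
    position: int
    amino_acid: str
    atoms: Tuple[Atom, ...]

def describe(structure):
    """(b) Express the entered coordinates, here as a simple textual listing."""
    return [f"{r.amino_acid}{r.position}: {len(r.atoms)} atoms" for r in structure]

def substitute(structure, position, new_amino_acid):
    """(c) Alter the composition at one position, keeping only backbone atoms there."""
    altered = []
    for r in structure:
        if r.position == position:
            kept = tuple(a for a in r.atoms if a.name in BACKBONE)
            r = replace(r, amino_acid=new_amino_acid, atoms=kept)
        altered.append(r)
    return altered

# (a) Enter coordinates; these values are made up and are not from FIGURE 5.
structure = [Residue(17, "Lys", (Atom("N", 0.0, 0.0, 0.0),
                                 Atom("CA", 1.5, 0.0, 0.0),
                                 Atom("CB", 2.0, 1.2, 0.0)))]
altered = substitute(structure, 17, "Arg")
```

A real viewer would render the coordinates graphically and rebuild the new side chain; the sketch only shows how the entered data, the display, and the alteration relate to one another.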
[0093] The coordinates for the preferred computer program used are presented in FIGURE 5. The preferred
computer program is Insight II, version 4, available from Biosym in San Diego, CA. For the raw crystallographic structure,
the observed intensities of the diffraction data ("F-obs") and the orthogonal coordinates are also deposited in the
Protein Data Bank, Chemistry Department, Brookhaven National Laboratory, Upton, New York 19723, USA and these are
herein incorporated by reference.

[0094] Once the coordinates are entered into the Insight II program, one can easily display the three dimensional
G-CSF molecule representation on a computer screen. The preferred computer system for display is Silicon Graphics
320 VGX (San Diego, CA). For stereoscopic viewing, one may wear eyewear (Crystal Eyes, Silicon Graphics) which
allows one to visualize the G-CSF molecule in three dimensions stereoscopically, so one may turn the molecule and
envision molecular design.

[0095] Thus, the present invention provides a method of designing or preparing a G-CSF analog with the aid of a
computer comprising:

(a) providing said computer with the means for displaying the three dimensional structure of a G-CSF molecule
including displaying the composition of moieties of said G-CSF molecule, preferably displaying the three
dimensional location of each amino acid, and more preferably displaying the three dimensional location of each atom of
a G-CSF molecule;

(b) viewing said display;

(c) selecting a site on said display for alteration in the composition of said molecule or the location of a moiety; and

(d) preparing a G-CSF analog with such alteration.

[0096] The alteration may be selected based on the desired structural characteristics of the end-product G-CSF
analog, and considerations for such design are described in more detail below. Such considerations include the location
and compositions of hydrophobic amino acid residues, particularly residues internal to the helical structures of a G-CSF
molecule which residues, when altered, alter the overall structure of the internal core of the molecule and may prevent
receptor binding; the location and compositions of external loop structures, alteration of which may not affect the overall
structure of the G-CSF molecule.

[0097] FIGURES 2-4 illustrate the overall three dimensional conformation in different ways. The topological
diagram, the ribbon diagram, and the barrel diagram all illustrate aspects of the conformation of G-CSF.
[0098] FIGURE 2 illustrates a comparison between G-CSF and other molecules. There is a similarity of
architecture, although these growth factors differ in the local conformations of their loops and bundle geometries. The
up-up-down-down topology with two long crossover connections is conserved, however, among all six of these molecules,
despite the dissimilarity in amino acid sequence.

[0099] FIGURE 3 illustrates in more detail the secondary structure of recombinant human G-CSF. This ribbon
diagram illustrates the handedness of the helices and their positions relative to each other.

[0100] FIGURE 4 illustrates in a different way the conformation of recombinant human G-CSF. This "barrel" diagram
illustrates the overall architecture of recombinant human G-CSF.

C. Preparation of Analogs Using M13 Mutagenesis

[0101] This example relates to the preparation of G-CSF analogs using site directed mutagenesis techniques
involving the single stranded bacteriophage M13, according to methods published in PCT Application No. WO 85/00817
(Souza et al., published February 28, 1985, herein incorporated by reference). This method essentially involves using
a single-stranded nucleic acid template of the non-mutagenized sequence, and binding to it a smaller oligonucleotide
containing the desired change in the sequence. Hybridization conditions allow for non-identical sequences to hybridize
and the remaining sequence is filled in to be identical to the original template. What results is a double stranded
molecule, with one of the two strands containing the desired change. This mutagenized single strand is separated, and used
itself as a template for its complementary strand. This creates a double stranded molecule with the desired change.
[0102] The original G-CSF nucleic acid sequence used is presented in FIGURE 1, and the oligonucleotides
containing the mutagenized nucleic acid(s) are presented in Table 2. Abbreviations used herein for amino acid residues and
nucleotides are conventional, see Stryer, Biochemistry, 3d Ed., W.H. Freeman and Company, N.Y., N.Y. 1988, inside
back cover.

[0103] The original G-CSF nucleic acid sequence was first placed into vector M13mp21. The DNA from single
stranded phage M13mp21 containing the original G-CSF sequence was then isolated, and resuspended in water. For
each reaction, 200 ng of this DNA was mixed with 1.5 pmole of phosphorylated oligonucleotide (Table 2) and
suspended in 0.1M Tris, 0.01M MgCl2, 0.005M DTT, 0.1mM ATP, pH 8.0. The DNAs were annealed by heating to 65°C and
slowly cooling to room temperature.

[0104] Once cooled, 0.5mM of each ATP, dATP, dCTP, dGTP, TTP, 1 unit of T4 DNA ligase and 1 unit of Klenow
fragment of E. coli polymerase I were added to the 1 unit of annealed DNA in 0.1M Tris, 0.025M NaCl, 0.01M MgCl2, 0.01M
DTT, pH 7.5.

[0105] The now double stranded, closed circular DNA was used to transfect E. coli without further purification.
Plaques were screened by lifting the plaques with nitrocellulose filters, and then hybridizing the filters with single
stranded DNA end-labeled with P32 for 1 hour at 55-60°C. After hybridization, the filters were washed at 0-3°C below
the melt temperature of the oligo (2°C for A-T, 4°C for G-C), which selectively left autoradiography signals corresponding
to plaques with phage containing the mutated sequence. Positive clones were confirmed by sequencing.
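The melt temperature rule quoted above (2°C for each A-T pair, 4°C for each G-C pair) is simple to compute. The sketch below is illustrative only; the function name and the example oligonucleotide are hypothetical and are not taken from Table 2.

```python
def melt_temperature_c(oligo):
    """Approximate melt temperature: 2 degrees C per A or T, 4 degrees C per G or C."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("oligo may contain only A, C, G and T")
    return 2 * at + 4 * gc

# Hypothetical 18-mer: 10 A/T and 8 G/C bases.
example_tm = melt_temperature_c("ATGCATGCATGCATGCAT")
```

For this example the rule gives 52°C, so the wash temperature would be chosen just below that value.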

[0106] Set forth below are the oligonucleotides used for each G-CSF analog prepared via the M13 mutagenesis
method. The nomenclature indicates the residue and the position of the original amino acid (e.g., Lysine at position 17),
and the residue and position of the substituted amino acid (e.g., arginine 17). A substitution involving more than one
residue is indicated via superscript notation, with commas between the noted positions or a semicolon indicating
different residues. Deletions with no substitutions are so noted. The oligonucleotide sequences used for M13-based
mutagenesis are next indicated; these oligonucleotides were manufactured synthetically, although the method of preparation
is not critical; any nucleic acid synthesis method and/or equipment may be used. The length of the oligo is also
indicated. As indicated above, these oligos were allowed to contact the single stranded phage vector, and then single
nucleotides were added to complete the G-CSF analog nucleic acid sequence.
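The analog nomenclature can be read mechanically. The sketch below assumes a plain-text rendering of the superscript notation (for example "Lys17->Arg17", "Gln12,21->Glu12,21", and a semicolon between changes to different residues); that rendering and the function name are assumptions made for illustration.

```python
import re

_PART = re.compile(r"^([A-Za-z]{3})(\d+(?:,\d+)*)->([A-Za-z]{3})(\d+(?:,\d+)*)$")

def parse_analog(name):
    """Return a list of (position, original residue, substituted residue) tuples."""
    changes = []
    for part in name.split(";"):
        match = _PART.match(part.replace(" ", ""))
        if match is None:
            raise ValueError(f"unrecognized analog notation: {part!r}")
        old, old_positions, new, new_positions = match.groups()
        if old_positions != new_positions:
            raise ValueError("positions differ between original and substituted residues")
        changes.extend((int(p), old, new) for p in old_positions.split(","))
    return changes
```

For example, `parse_analog("Gln12,21->Glu12,21")` yields one entry for position 12 and one for position 21. Deletions, which the text says are noted separately, are not handled by this sketch.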

D. Preparation of G-CSF Analogs Using DNA Amplification

[0107] This example relates to methods for producing G-CSF analogs using a DNA amplification technique.
Essentially, DNA encoding each analog was amplified in two separate pieces, combined, and then the total sequence itself
amplified. Depending upon where the desired change in the original G-CSF DNA was to be made, internal primers were
used to incorporate the change, and generate the two separate amplified pieces. For example, for amplification of the
5' end of the desired analog DNA, a 5' flanking primer (complementary to a sequence of the plasmid upstream from the
G-CSF original DNA) was used at one end of the region to be amplified, and an internal primer, capable of hybridizing
to the original DNA but incorporating the desired change, was used for priming the other end. The resulting amplified
region stretched from the 5' flanking primer through the internal primer. The same was done for the 3' terminus, using
a 3' flanking primer (complementary to a sequence of the plasmid downstream from the G-CSF original DNA) and an
internal primer complementary to the region of the intended mutation. Once the two "halves" (which may or may not be
equal in size, depending on the location of the internal primer) were amplified, the two "halves" were allowed to connect.
Once connected, the 5' flanking primer and the 3' flanking primer were used to amplify the entire sequence containing
the desired change.

[0108] If more than one change is desired, the above process may be modified to incorporate the change into the
internal primer, or the process may be repeated using a different internal primer. Alternatively, the gene amplification
process may be used with other methods for creating changes in nucleic acid sequence, such as the phage based
mutagenesis technique as described above. Examples of processes for preparing analogs with more than one change are
described below.

[0109] To create the G-CSF analogs described below, the template DNA used was the sequence as in FIGURE 1
plus certain flanking regions (from a plasmid containing the G-CSF coding region). These flanking regions were used
as the 5' and 3' flanking primers and are set forth below. The amplification reactions were performed in 40 ul volumes
containing 10mM Tris-HCl, 1.5mM MgCl2, 50mM KCl, 0.1 mg/ml gelatin, pH 8.3 at 20°C. The 40 ul reactions also
contained 0.1 mM of each dNTP, 10 pmoles of each primer, and 1 ng of template DNA. Each amplification was repeated for
15 cycles. Each cycle consisted of 0.5 minutes at 94°C, 0.5 minutes at 50°C, and 0.75 minutes at 72°C. Flanking
primers were 20 nucleotides in length and internal primers were 20 to 25 nucleotides in length. This resulted in multiple
copies of double stranded DNA encoding either the front portion or the back portion of the desired G-CSF analog.
[0110] For combining the two "halves," one fortieth of each of the two reactions was combined in a third DNA
amplification reaction. The two portions were allowed to anneal at the internal primer location, as their ends bearing the
mutation were complementary, and following a cycle of polymerization, give rise to a full length DNA sequence. Once
so annealed, the whole analog was amplified using the 5' and 3' flanking primers. This amplification process was
repeated for 15 cycles as described above.
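The joining of the two "halves" at their shared, mutation-bearing ends can be sketched on strings. This is an illustration of the principle only: both halves are written as top-strand sequences, the sequences are made up, and the function name and minimum overlap are hypothetical.

```python
def join_halves(front, back, min_overlap=15):
    """Merge two top-strand sequences that share an overlapping end.

    The longest suffix of `front` that is also a prefix of `back` is treated as the
    annealing region around the internal primers.
    """
    longest = min(len(front), len(back))
    for n in range(longest, min_overlap - 1, -1):
        if front.endswith(back[:n]):
            return front + back[n:]
    raise ValueError("the two halves do not share a sufficient overlap")

# Made-up sequences with a 6-base overlap, for illustration only.
full_length = join_halves("AAACCCGGGTTT", "GGGTTTACGT", min_overlap=6)
```

In the laboratory procedure the overlap is set by the internal primers (20 to 25 nucleotides), and the merged product is then amplified with the two flanking primers.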

[0111] The completed, amplified analog DNA sequence was cleaved with XbaI and XhoI restriction endonucleases
to produce cohesive ends for insertion into a vector. The cleaved DNA was placed into a plasmid vector, and that vector
was used to transform E. coli. Transformants were challenged with kanamycin at 50 ug/ml and incubated at 30°C.
Production of G-CSF analog protein was confirmed by polyacrylamide gel electrophoresis of a whole cell lysate. The
presence of the desired mutation was confirmed by DNA sequence analysis of plasmid purified from the production isolate.
Cultures were then grown, and cells were harvested, and the G-CSF analogs were purified as set forth below.
[0112] Set forth below in Table 3 are the specific primers used for each analog made using gene amplification.



Table 3

Analog            Internal Primer (5'->3')                   Seq. ID

His44->Ala44      5' primer - TTCCGGAGCGCACAGTTTG            49
                  3' primer - CAAACTGTGGGCTCCGGAAGAGC        50

Thr117->Ala117    5' primer - ATGCCAAATTGCAGTAGCAAAG         51
                  3' primer - CTTTGCTACTGCAATTTGGCAACA       52

Asp110->Ala110    5' primer - ATCAGCTACTGCTAGCTGCAGA         53
                  3' primer - TCTGCAGCTAGCAGTAGCTGACT        54

Gln21->Ala21      5' primer - TTACGAACCGCTTCCAGACATT         55
                  3' primer - AATGTCTGGAAGCGGTTCGTAAAAT      56




EP0 612 846B1 



Table 3 (continued)

Analog            Internal Primer (5'->3')                          Seq. ID

Asp113->Ala113    5' primer - GTA[illegible]AAATGCAGCTACATCTA       57
                  3' primer - TAG[illegible]GTAGCTGCATTTGCTACTAC    58

His53->Ala53      5' primer - CCAAGAGAAGCACCCAGCAG                  59
                  3' primer - CTGCTGGGTGCTTCTCTTGGGA                60

For each analog, the following 5' flanking primer was used:

                  5'-CACTGGCGGTGATAATGAGC                           61

For each analog, the following 3' flanking primer was used:

                  3'-GGTCATTACGGACCGGATC                            62



1. Construction of Double Mutation

[0113] To make G-CSF analog Gln12,21->Glu12,21, two separate DNA amplifications were conducted to create the
two DNA mutations. The template DNA used was the sequence as in FIGURE 1 plus certain flanking regions (from a
plasmid containing the G-CSF coding region). The precise sequences are listed below. Each of the two DNA
amplification reactions was carried out using a Perkin Elmer/Cetus DNA Thermal Cycler. The 40 ul reaction mix consisted of 1X
PCR Buffer (Cetus), 0.2 mM each of the 4 dXTPs (Cetus), 50 pmoles of each primer oligonucleotide, 2 ng of G-CSF
template DNA (on a plasmid vector), and 1 unit of Taq polymerase (Cetus). The amplification process was carried out
for 30 cycles. Each cycle consisted of 1 minute at 94°C, 2 minutes at 50°C, and 3 minutes at 72°C.
[0114] DNA amplification "A" used the oligonucleotides:
5' CCACTGGCGGTGATACTGAGC 3' (Seq. ID 63) and
5' AGCAGAAAGCTTTCCGGCAGAGAAGAAGCAGGA 3' (Seq. ID 64)
[0115] DNA amplification "B" used the oligonucleotides:
5' GCCGCAAAGCTTTCTGCTGAAATGTCTGGAAGAGGTTCGTAAAATCCAGGGTGA 3' (Seq. ID 65) and
5' CTGGAATGCAGAAGCAAATGCCGGCATAGCACCTTCAGTCGGTTGCAGAGCTGGTGCCA 3' (Seq. ID 66)
[0116] From the 109 base pair double stranded DNA product obtained after DNA amplification "A", a 64 base pair
XbaI to HindIII DNA fragment was cut and isolated that contained the DNA mutation Gln12->Glu12. From the 509 base
pair double stranded DNA product obtained after DNA amplification "B", a 197 base pair HindIII to BsmI DNA fragment
was cut and isolated that contained the DNA mutation Gln21->Glu21.

[0117] The "A" and "B" fragments were ligated together with a 4.8 kilo-base pair XbaI to BsmI DNA plasmid vector
fragment. The ligation mix consisted of equal molar DNA restriction fragments, ligation buffer (25 mM Tris-HCl pH 7.8,
10 mM MgCl2, 2 mM DTT, 0.5 mM rATP, and 100 ug/ml BSA) and T4 DNA ligase and was incubated overnight at 14°C.
The ligated DNA was then transformed into E. coli FM5 cells by electroporation using a Bio Rad Gene Pulser apparatus
(BioRad, Richmond, CA). A clone was isolated and the plasmid construct verified to contain the two mutations by DNA
sequencing. This "intermediate" vector also contained a deletion of a 193 base pair BsmI to BsmI DNA fragment. The
final plasmid vector was constructed by ligation and transformation (as described above) of DNA fragments obtained
by cutting and isolating a 2 kilo-base pair SstI to BamHI DNA fragment from the intermediate vector, a 2.8 kbp SstI to
EcoRI DNA fragment from the plasmid vector, and a 360 bp BamHI to EcoRI DNA fragment from the plasmid vector.
The final construct was verified by DNA sequencing the G-CSF gene. Cultures were grown, and the cells were
harvested, and the G-CSF analogs were purified as set forth below.

[0118] As indicated above, any combination of mutagenesis techniques may be used to generate a G-CSF analog
nucleic acid (and expression product) having one or more than one alteration. The two examples above, using M13-
based mutagenesis and gene amplification-based mutagenesis, are illustrative.

E. Expression of G-CSF Analog DNA

[0119] The G-CSF analog DNAs were then placed into a plasmid vector and used to transform E. coli strain FM5
(ATCC#53911). The present G-CSF analog DNAs contained on plasmids and in bacterial host cells are available from
the American Type Culture Collection, Rockville, MD, and the accession designations are indicated below.

[0120] One liter cultures were grown in broth (containing 10g tryptone, 5g yeast extract and 5g NaCl) at 30°C until
reaching a density at A600 of 0.5, at which point they were rapidly heated to 42°C. The flasks were allowed to continue
shaking for three hours.
[0121] Other prokaryotic or eukaryotic host cells may also be used, such as other bacterial cells, strains or species,
mammalian cells in culture (COS, CHO or other types), insect cells or multicellular organs or organisms, or plant cells
or multicellular organs or organisms, and a skilled practitioner will recognize the appropriate host. The present G-CSF
analogs and related compositions may also be prepared synthetically, as, for example, by solid phase peptide synthesis
methods, or other chemical manufacturing techniques. Other cloning and expression systems will be apparent to those
skilled in the art.

F. Purification of G-CSF Analog Protein

[0122] Cells were harvested by centrifugation (10,000 x G, 20 minutes, 4°C). The pellet (usually 5 grams) was
resuspended in 30 ml of 1mM DTT and passed three times through a French press cell at 10,000 psi. The broken cell
suspension was centrifuged at 10,000 x G for 30 minutes, the supernatant removed, and the pellet resuspended in 30-40
ml water. This was recentrifuged at 10,000 x G for 30 minutes, and this pellet was dissolved in 25 ml of 2% Sarkosyl
and 50mM Tris at pH 8. Copper sulfate was added to a concentration of 40uM, and the mixture was allowed to stir for
at least 15 hours at 15-25°C. The mixture was then centrifuged at 20,000 x G for 30 minutes. The resultant solubilized
protein mixture was diluted four-fold with 13.3 mM Tris, pH 7.7, the Sarkosyl was removed, and the supernatant was
then applied to a DEAE-cellulose (Whatman DE-52) column equilibrated in 20mM Tris, pH 7.7. After loading and
washing the column with the same buffer, the analogs were eluted with 20mM Tris/NaCl (between 35mM to 100mM
depending on the analog, as indicated below), pH 7.7. For most of the analogs, the eluent from the DEAE column was adjusted
to a pH of 5.4 with 50% acetic acid and diluted as necessary (to obtain the proper conductivity) with 5mM sodium
acetate pH 5.4. The solution was then loaded onto a CM-sepharose column equilibrated in 20 mM sodium acetate, pH 5.4.
The column was then washed with 20mM NaAc, pH 5.4 until the absorbance at 280 nm was approximately zero. The
G-CSF analog was then eluted with sodium acetate/NaCl in concentrations as described below in Table 4. The DEAE
column eluents for those analogs not applied to the CM-sepharose column were dialyzed directly into 10mM NaAc, pH
4.0 buffer. The purified G-CSF analogs were then suitably isolated for in vitro analysis. The salt concentrations used for
eluting the analogs varied, as noted above. Below, the salt concentrations for the DEAE cellulose column and for the
CM-sepharose column are listed:



Table 4

Salt Concentrations

Analog                                DEAE Cellulose    CM-Sepharose

Lys17->Arg17                          35mM              37.5mM
Lys24->Arg24                          35mM              37.5mM
Lys35->Arg35                          35mM              37.5mM
Lys41->Arg41                          35mM              37.5mM
Lys17,24,35->Arg17,24,35              35mM              37.5mM
Lys17,35,41->Arg17,35,41              35mM              37.5mM
Lys24,35,41->Arg24,35,41              35mM              37.5mM
Lys17,24,35,41->Arg17,24,35,41        35mM              37.5mM
Lys17,24,41->Arg17,24,41              35mM              37.5mM
Gln68->Glu68                          60mM              37.5mM
Cys37,43->Ser37,43                    40mM              37.5mM
Gln26->Ala26                          40mM              40mM
Gln174->Ala174                        40mM              40mM
[illegible]                           40mM              40mM
[illegible]                           40mM              40mM
[illegible]                           N/A               N/A
Lys41->Ala41                          150mM             40mM
His44->Lys44                          40mM              60mM
Gln47->Ala47                          40mM              40mM
Arg23->Ala23                          40mM              40mM
Lys24->Ala24                          120mM             40mM
Glu20->Ala20                          40mM              60mM
[illegible]                           40mM              80mM
Met127->Glu127                        80mM              40mM
[illegible]                           80mM              40mM
Met127->Leu127                        40mM              40mM
Met138->Leu138                        40mM              40mM
Cys18->Ala18                          40mM              37.5mM
Gln12,21->Glu12,21                    60mM              37.5mM
Gln12,21,68->Glu12,21,68              60mM              37.5mM
Glu20->Ala20; [illegible]->Gly13      40mM              80mM
Met127,138->Leu127,138                40mM              [illegible]
Ser[illegible]->Ala[illegible]        40mM              40mM
[illegible]                           40mM              [illegible]
Gln121->Ala121                        40mM              [illegible]
[illegible]                           [illegible]       40mM
His44->Ala44**                        40mM              N/A
His53->Ala53**                        50mM              N/A
Asp110->Ala110**                      40mM              N/A
Asp113->Ala113**                      40mM              N/A
[illegible]                           50mM              N/A
Asp28->Ala28; [illegible]Ala110**     50mM              N/A
Glu124->Ala124**                      40mM              40mM

* For Deletion [illegible], the data are unavailable.

** For these analogs, the DEAE cellulose column alone was used for purification.



[0123] The above purification methods are illustrative, and a skilled practitioner will recognize that other means are
available for obtaining the present G-CSF analogs.
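As a worked check of the four-fold dilution step in the purification above, the sketch below computes the Tris concentration after mixing. It assumes that "four-fold" means one volume of the 50mM Tris extract plus three volumes of 13.3 mM Tris; that reading, and the function name, are assumptions.

```python
def mixed_concentration(parts):
    """Concentration after mixing; parts is a list of (volume, concentration) pairs."""
    total_volume = sum(volume for volume, _ in parts)
    return sum(volume * conc for volume, conc in parts) / total_volume

# One volume at 50 mM Tris plus three volumes at 13.3 mM Tris.
tris_mm = mixed_concentration([(1, 50.0), (3, 13.3)])
```

Under this assumption the mixture comes to roughly 22.5 mM Tris, close to the 20mM Tris in which the DEAE-cellulose column is equilibrated.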



G. Biological Assays

[0124] Regardless of which methods were used to create the present G-CSF analogs, the analogs were subject to
assays for biological activity. Tritiated thymidine assays were conducted to ascertain the degree of cell division. Other
biological assays, however, may be used to ascertain the desired activity. Biological assays such as assaying for the
ability to induce terminal differentiation in mouse WEHI-3B (D+) leukemic cell line also provide an indication of G-CSF
activity. See Nicola, et al., Blood 54: 614-27 (1979). Other in vitro assays may be used to ascertain biological activity.
See Nicola, Annu. Rev. Biochem. 58: 45-77 (1989). In general, the test for biological activity should provide analysis for
the desired result, such as increase or decrease in biological activity (as compared to non-altered G-CSF), different
biological activity (as compared to non-altered G-CSF), receptor affinity analysis, or serum half-life analysis. The list is
incomplete, and those skilled in the art will recognize other assays useful for testing for the desired end result.

[0125] The 3H-thymidine assay was performed using standard methods. Bone marrow was obtained from
sacrificed female Balb C mice. Bone marrow cells were briefly suspended, centrifuged, and resuspended in a growth
medium. A 160 ul aliquot containing approximately 10,000 cells was placed into each well of a 96 well micro-titer plate.
Samples of the purified G-CSF analog (as prepared above) were added to each well, and incubated for 68 hours.
Tritiated thymidine was added to the wells and allowed to incubate for 5 additional hours. After the 5 hour incubation time,
the cells were harvested, filtered, and thoroughly rinsed. The filters were added to a vial containing scintillation fluid.
The beta emissions were counted (LKB Betaplate scintillation counter). Standards and analogs were analyzed in
triplicate, and samples which fell substantially above or below the standard curve were re-assayed with the proper dilution.
The results reported here are the average of the triplicate analog data relative to the unaltered recombinant human
G-CSF standard results.
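The reporting convention described above (average of triplicates, relative to the unaltered standard) can be written out as follows. This is a simplified sketch: it takes a plain ratio of mean counts, whereas the actual analysis used a standard curve, and the function name and the counts are hypothetical.

```python
from statistics import mean

def relative_activity(analog_counts, standard_counts):
    """Mean analog signal as a percentage of the mean standard signal."""
    return 100.0 * mean(analog_counts) / mean(standard_counts)

# Made-up triplicate scintillation counts, for illustration only.
percent = relative_activity([900, 1000, 1100], [2000, 2000, 2000])
```

With these made-up counts the analog would be reported at 50% of the control standard.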



H. HPLC Analysis 


[0126] High pressure liquid chromatography was performed on purified samples of analog. Although peak position
on a reverse phase HPLC column is not a definitive indication of structural similarity between two proteins, analogs
which have similar retention times may have the same type of hydrophobic interactions with the HPLC column as the
non-altered molecule. This is one indication of an overall similar structure.

[0127] Samples of the analog and the non-altered recombinant human G-CSF were analyzed on a reverse phase
(0.46 x 25 cm) Vydac 214TP54 column (Separations Group, Inc., Hesperia, CA). The purified analog G-CSF samples
were prepared in 20 mM acetate and 40 mM NaCl solution buffered at pH 5.2 to a final concentration of 0.1 mg/ml to 5
mg/ml, depending on how the analog performed in the column. Varying amounts (depending on the concentration) were
loaded onto the HPLC column, which had been equilibrated with an aqueous solution containing 1% isopropanol,
52.8% acetonitrile, and 0.38% trifluoro acetate (TFA). The samples were subjected to a gradient of 0.86%/minute
acetonitrile, and 0.002% TFA.

I. Results

[0128] Presented below are the results of the above biological assays and HPLC analysis. Biological activity is the
average of triplicate data and reported as a percentage of the control standard (non-altered G-CSF). Relative HPLC
peak position is the position of the analog G-CSF relative to the control standard (non-altered G-CSF) peak. The "-" or
"+" symbols indicate whether the analog HPLC peak was in advance of or followed the control standard peak (in
minutes). Not all of the variants had been analyzed for relative HPLC peak, and only those so analyzed are included below.
Also presented are the American Type Culture Collection designations for E. coli host cells containing the nucleic acids
coding for the present analogs, as prepared above.

1. Identification of Structure-Function Relationships

[0129] The first step used to design the present analogs was to determine what moieties are necessary for structural
integrity of the G-CSF molecule. This was done at the amino acid residue level, although the atomic level is also
available for analysis. Modification of the residues necessary for structural integrity results in change in the overall
structure of the G-CSF molecule. This may or may not be desirable, depending on the analog one wishes to produce. The
working examples here were designed to maintain the overall structural integrity of the G-CSF molecule, for the purpose
of maintaining binding of the analog to the G-CSF receptor (as used in this section below, the "G-CSF receptor"
refers to the natural G-CSF receptor, found on hematopoietic cells). It was assumed, and confirmed by
the studies presented here, that G-CSF receptor binding is a necessary step for at least one biological activity, as
determined by the above biological assays.

[0130] As can be seen from the figures, G-CSF (here, recombinant human met-G-CSF) is an antiparallel 4-alpha
helical bundle with a left-handed twist and with overall dimensions of 45 Å x 30 Å x 24 Å. The four helices within the
bundle are referred to as helices A, B, C and D, and their connecting loops are known as the AB, BC and CD loops. The
helix crossing angles range from -167.5° to -159.4°. Helices A, B, and C are straight, whereas helix D contains two
kinds of structural characteristics, at Gly 150 and Ser 160 (of the recombinant human met-G-CSF). Overall, the G-CSF
molecule is a bundle of four helices, connected in series by external loops. This structural information was then
correlated with known functional information. It was known that residues (including methionine at position 1) 47, 23, 24, 20,
21, 44, 53, 113, 110, 28 and 114 may be modified, and the effect on biological activity would be substantial.

[0131] The majority of single mutations which lowered biological activity were centered around two regions of
G-CSF that are separated by 30 Å, and are located on different faces of the four helix bundle. One region involves
interactions between the A helix and the D helix. This is further confirmed by the presence of salt bridges in the non-altered
molecule as follows:

Atom          Helix    Atom          Helix    Distance (Å)
Arg170 N1     D        Tyr166 OH     A        3.3
Tyr166 OH     D        Arg23 N2      A        3.3
Glu163 OE1    D        Arg23 N1      A        2.8
Arg23 N1      A        Gln26 OE1     A        3.1
Gln159 NE2    D        Gln26 O       A        3.3



[0132] Distances reported here were for molecule A, as indicated in FIGURE 5 (wherein three G-CSF molecules
crystallized together and were designated as A, B, and C). As can be seen, there is a web of salt bridges between helix
A and helix D, which act to stabilize the helix A structure, and therefore affect the overall structure of the G-CSF
molecule.
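
The contact table above lends itself to a small data-structure sketch. The rows below are transcribed from the table as printed (helix labels included, and with distances assumed to be in angstroms); the type and function names are hypothetical and only illustrate how the "web" of contacts around helix A might be summarized.

```python
from collections import Counter
from typing import NamedTuple


class Contact(NamedTuple):
    atom1: str
    helix1: str
    atom2: str
    helix2: str
    distance: float  # assumed to be angstroms


# Rows as printed in the table above (molecule A of the crystal).
CONTACTS = [
    Contact("Arg170 N1", "D", "Tyr166 OH", "A", 3.3),
    Contact("Tyr166 OH", "D", "Arg23 N2", "A", 3.3),
    Contact("Glu163 OE1", "D", "Arg23 N1", "A", 2.8),
    Contact("Arg23 N1", "A", "Gln26 OE1", "A", 3.1),
    Contact("Gln159 NE2", "D", "Gln26 O", "A", 3.3),
]


def between_helices(contacts, first, second):
    """Return the contacts that link the two named helices."""
    return [c for c in contacts if {c.helix1, c.helix2} == {first, second}]


def residue_counts(contacts):
    """Count how many listed contacts each residue takes part in."""
    counts = Counter()
    for c in contacts:
        counts[c.atom1.split()[0]] += 1
        counts[c.atom2.split()[0]] += 1
    return counts


a_to_d = between_helices(CONTACTS, "A", "D")
hub, hub_count = residue_counts(CONTACTS).most_common(1)[0]
```

In this tabulation Arg 23 takes part in more listed contacts than any other residue, which is consistent with the emphasis placed on that position in the discussion of helix A.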

[0133] The area centering around residues Glu 20, Arg 23 and Lys 24 is found on the hydrophilic face of the A
helix (residues 20-37). Substitution of the residues with the non-charged alanine residue at positions 20 and 23 resulted
in similar HPLC retention times, indicating similarity in structure. Alteration of these sites altered the biological activity
(as indicated by the present assays). Substitution at Lys 24 altered biological activity, but did not result in a similar HPLC
retention time as the other two alterations.

[0134] The second site at which alteration lowered biological activity involves the AB helix. Changing glutamine at
position 47 to alanine (analog no. 19, above) reduced biological activity (in the thymidine uptake assay) to zero. The AB
helix is predominantly hydrophobic, except at the amino and carboxy termini; it contains one turn of a 3₁₀ helix. There
are two histidines, one at each terminus (His 44 and His 56), and an additional glutamate at residue 46 which has the potential
to form a salt bridge to His 44. The Fourier transform infrared spectrographic analysis (FTIR) of the analog suggests
this analog is structurally similar to the non-altered recombinant G-CSF molecule. Further testing showed that this
analog would not crystallize under the same conditions as the non-altered recombinant molecule.
[0135] Alterations at the carboxy terminus (Gln 174, Arg 167 and Arg 170) had little effect on biological activity. In
contrast, deletion of the last eight residues (167-175) lowered biological activity. These results may indicate that the
deletion destabilizes the overall structure, which prevents the mutant from proper binding to the G-CSF receptor (and
thus initiating signal transduction).

[0136] Generally, for the G-CSF internal core (the internal four helix bundle lacking the external loops), the
hydrophobic internal residues are essential for structural integrity. For example, in helix A, the internal hydrophobic residues
are (with methionine being position 1 as in FIGURE 1) Phe 14, Cys 18, Val 22, Ile 25, Ile 32 and Leu 36. The other
hydrophobic residues (again with the met at position 1) are: helix B, Ala 72, Leu 76, Leu 79, Leu 83, Tyr 86, Leu 90,
Leu 93; helix C, Leu 104, Leu 107, Val 111, Ala 114, Ile 118, Met 122; and helix D, Val 154, Val 158, Phe 161, Val 164,
Val 168, Leu 172.
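
As an illustration only, the core residues listed above can be held in a lookup table so that a proposed substitution position may be checked against the hydrophobic core before an analog is designed. The residue names follow the text as printed (met at position 1); the function name is hypothetical.

```python
# Internal hydrophobic residues by helix, as listed in the text above.
CORE_RESIDUES = {
    "A": {14: "Phe", 18: "Cys", 22: "Val", 25: "Ile", 32: "Ile", 36: "Leu"},
    "B": {72: "Ala", 76: "Leu", 79: "Leu", 83: "Leu", 86: "Tyr",
          90: "Leu", 93: "Leu"},
    "C": {104: "Leu", 107: "Leu", 111: "Val", 114: "Ala", 118: "Ile",
          122: "Met"},
    "D": {154: "Val", 158: "Val", 161: "Phe", 164: "Val", 168: "Val",
          172: "Leu"},
}


def core_helix(position):
    """Return the helix whose listed core contains this position, else None."""
    for helix, residues in CORE_RESIDUES.items():
        if position in residues:
            return helix
    return None


total_core = sum(len(residues) for residues in CORE_RESIDUES.values())
```

For example, `core_helix(22)` returns `"A"`, whereas `core_helix(47)` returns `None`, since position 47 lies in the AB helix rather than in the listed core.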

[0137] The above biological activity data, from the presently prepared G-CSF analogs, demonstrate that modification
of the external loops interferes least with G-CSF overall structure. Preferred loops for analog preparation are the AB
loop and the CD loop. The loops are relatively flexible structures as compared to the helices. The loops may contribute
to the proteolysis of the molecule. G-CSF is relatively fast acting in vivo, as the purpose the molecule serves is to
generate a response to a biological challenge, i.e., selectively stimulate neutrophils. The G-CSF turnover rate is also
relatively fast. The flexibility of the loops may provide a "handle" for proteases to attach to the molecule to inactivate the
molecule. The loops may be modified to prevent protease degradation while (via retention of the overall structure of
non-modified G-CSF) causing no loss in biological activity.

[0138] This phenomenon is probably not limited to the G-CSF molecule but may also be common to the other
molecules with known similar overall structures, as presented in FIGURE 2. Alteration of the external loop of, for example,
hGH, interferon B, IL-2, GM-CSF and IL-4 may provide the least change to the overall structure. The external loops on
the GM-CSF molecule are not as flexible as those found on the G-CSF molecule, and this may indicate a longer serum
life, consistent with the broader biological activity of GM-CSF. Thus, the external loops of GM-CSF may be modified by
releasing the external loops from the beta-sheet structure, which may make the loops more flexible (similar to those of
G-CSF) and therefore make the molecule more susceptible to protease degradation (and thus increase the turnover rate).
[0139] Alteration of these external loops may be effected by stabilizing the loops by connection to one or more of
the internal helices. Connecting means are known to those in the art, such as the formation of a beta sheet, salt bridge,
disulfide bonding or hydrophobic interactions, and other means are available. Also, deletion of one or more moieties,
such as one or more amino acid residues or portions thereof, to prepare an abbreviated molecule and thus eliminate
certain portions of the external loops may be effected.

[0140] Thus, by alteration of the external loops, preferably the AB loop (amino acids 58-72 of r-hu-met G-CSF) or
the CD loop (amino acids 119 to 145 of r-hu-met G-CSF), and less preferably the amino terminus (amino acids 1-10),
one may therefore modify the biological function without elimination of G-CSF receptor binding. For example, one may:
(1) increase half-life (or prepare an oral dosage form, for example) of the G-CSF molecule by, for example, decreasing
the ability of proteases to act on the G-CSF molecule or adding chemical modifications to the G-CSF molecule, such
as one or more polyethylene glycol molecules or enteric coatings for oral formulation which would act to change some
characteristic of the G-CSF molecule as described above, such as increasing serum or other half-life or decreasing
antigenicity; (2) prepare a hybrid molecule, such as combining G-CSF with part or all of another protein such as another
cytokine or another protein which effects signal transduction via entry through the cell through a G-CSF receptor
transport mechanism; or (3) increase the biological activity as in, for example, the ability to selectively stimulate neutrophils
(as compared to a non-modified G-CSF molecule). This list is not limited to the above exemplars.
[0141] Another aspect observed from the above data is that stabilizing surface interactions may affect biological
activity. This is apparent from comparing analogs 23 and 40. Analog 23 contains a substitution of the neutrally-charged
alanine residue for the charged aspartic acid residue at position 28, and such substitution resulted in a
50% increase in the biological activity (as measured by the disclosed thymidine uptake assays). The aspartic acid
residue at position 28 has a surface interaction with the aspartic acid residue at position 113; both residues being negatively
charged, there is a certain amount of instability (due to the repelling of like charged moieties). When, however, the
aspartic acid at position 113 is replaced with the neutrally-charged alanine, the biological activity drops to zero (in the
present assay system). This indicates that the aspartic acid at position 113 is critical to biological activity, and elimination
of the aspartic acid at position 28 serves to increase the effect that the aspartic acid at position 113 possesses.
[0142] The domains required for G-CSF receptor binding were also determined based on the above analogs
prepared and the G-CSF structure. The G-CSF receptor binding domain is located at residues (with methionine being
position 1) 11-57 (between the A and AB helix) and 100-118 (between the B and C helices). One may also prepare
abbreviated molecules capable of binding to a G-CSF receptor and initiating signal transduction for selectively stimulating
neutrophils by changing the external loop structure and having the receptor binding domains remain intact.
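
The residue ranges given in this section can be gathered into a simple lookup, sketched below. The ranges are those stated in the text (AB loop 58-72, CD loop 119-145, amino terminus 1-10, and receptor binding domain 11-57 and 100-118, all with methionine as position 1); the names `REGIONS` and `regions_at` are hypothetical.

```python
# (label, first residue, last residue), numbering with Met at position 1.
REGIONS = [
    ("amino terminus", 1, 10),
    ("receptor binding domain", 11, 57),
    ("AB loop", 58, 72),
    ("receptor binding domain", 100, 118),
    ("CD loop", 119, 145),
]


def regions_at(position):
    """Return the labels of every listed region containing the position."""
    return [label for label, first, last in REGIONS if first <= position <= last]
```

Under these ranges, `regions_at(47)` yields `["receptor binding domain"]` and `regions_at(130)` yields `["CD loop"]`, so a change at position 130 would fall in a loop preferred for modification while leaving the stated binding domains intact.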
[0143] Residues essential for biological activity and presumably G-CSF receptor binding and signal transduction
have been identified. Two distinct sites are located on two different regions of the secondary structure. What is here
called "Site A" is located on a helix which is constrained by salt bridge contacts between two other members of the
helical bundle. The second site, "Site B", is located on a relatively more flexible helix, AB. The AB helix is potentially more
sensitive to local pH changes because of the type and position of the residues at the carboxy and amino termini. This
flexible helix may be functionally important in a conformationally induced fit when binding to the G-CSF
receptor. Additionally, the extended portion of the D helix is also indicated to be a G-CSF receptor binding domain, as
ascertained by direct mutational and indirect comparative protein structure analysis. Deletion of the carboxy terminal
end of r-hu-met-G-CSF reduces activity as it does for hGH. See Cunningham and Wells, Science 244: 1081-1084
(1989). Cytokines which have similar structures, such as IL-6 and GM-CSF with predicted similar topology, also center
their biological activity along the carboxy end of the D helix. See Bazan, Immunology Today 11: 350-354 (1990).

[0144] A comparison of the structures and the positions of G-CSF receptor binding determinants between G-CSF
and hGH suggests both molecules have similar means of signal transduction. Two separate G-CSF receptor binding
sites have been identified for hGH. De Vos et al., Science 2S: 306-32 (1991). One of these binding sites (called "Site
I") is formed by residues on the exposed faces of hGH's helix 1, the connection region between helix 1 and 2, and helix
4. The second binding site (called "Site II") is formed by surface residues of helix 1 and helix 3.

[0145] The G-CSF receptor binding determinants identified for G-CSF are located in the same relative positions as
those identified for hGH. The G-CSF receptor binding site located in the connecting region between helix A and B on
the AB helix (Site A) is similar in position to that reported for a small piece of helix (residues 38-47) of hGH. A single
point mutation in the AB helix of G-CSF significantly reduces biological activity (as ascertained in the present assays),
indicating the role in a G-CSF receptor-ligand interface. Binding of the G-CSF receptor may destabilize the 3₁₀ helical
nature of this region and induce a conformation change improving the binding energy of the ligand/G-CSF receptor
complex.

[0146] In the hGH receptor complex, the first helix of the bundle donates residues to both of the binding sites
required to dimerize the hGH receptor. Mutational analysis of the corresponding helix of G-CSF (helix A) has identified
three residues which are required for biological activity. Of these three residues, Glu 20 and Lys 24 lie on one face of
the helical bundle towards helix C, whereas the side chain of Arg 23 (in two of the three molecules in the asymmetric
unit) points to the face of the bundle towards helix D. The position of side chains of these biologically important residues
indicates that, similar to hGH, G-CSF may have a second G-CSF receptor binding site along the interface between helix
A and helix C. In contrast with the hGH molecule, the amino terminus of G-CSF has a limited biological role, as deletion
of the first 11 residues has little effect on the biological activity.

[0147] As indicated above (see FIGURE 2, for example), G-CSF has a topological similarity with other cytokines. A
correlation of the structure with previous biochemical studies, mutational analysis and direct comparison of specific
residues of the hGH receptor complex indicates that G-CSF has two receptor binding sites. Site A lies along the interface
of the A and D helices and includes residues in the small AB helix. Site B also includes residues in the A helix but lies
along the interface between helices A and C. The conservation of structure and relative positions of biologically important
residues between G-CSF and hGH is one indication of a common method of signal transduction in that the receptor
is bound in two places. It is therefore found that G-CSF analogs possessing altered G-CSF receptor binding domains
may be prepared by alteration at either of the G-CSF receptor binding sites (residues 20-57 and 145-175).
[0148] Knowledge of the three dimensional structure and correlation of the composition of G-CSF protein makes
possible a systematic, rational method for preparing G-CSF analogs. The above working examples have demonstrated
that the limitations of the size and polarity of the side chains within the core of the structure dictate how much change
the molecule can tolerate before the overall structure is changed.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Amgen Inc.

(ii) TITLE OF INVENTION: G-CSF ANALOG COMPOSITIONS AND METHODS

(iii) NUMBER OF SEQUENCES: 1X0

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Amgen Inc.
(B) STREET: Amgen Center, 1840 DeHavilland Drive
(C) CITY: Thousand Oaks
(D) STATE: California
(E) COUNTRY: United States of America
(F) ZIP: 91320-1789

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 30..554

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCTAGAAAAA ACCAAGGAGG TAATAAATA ATG ACT CCA TTA GGT CCT GCT TCT   53
                                Met Thr Pro Leu Gly Pro Ala Ser
                                1               5

TCT CTC CCG CAA AGC TTT CTG CTG AAA TGT CTG GAA CAG GTT CGT AAA  101
Ser Leu Pro Gln Ser Phe Leu Leu Lys Cys Leu Glu Gln Val Arg Lys
    10                  15                  20

ATC CAG GGT GAC GGT GCT GCA CTG CAA GAA AAA CTG TGC GCT ACT TAC  149
Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr Tyr
25                  30                  35                  40

AAA CTG TGC CAT CCG GAA GAG CTG GTA CTG CTG GGT CAT TCT CTT GGG  197
Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly
                45                  50                  55

ATC CCG TGG GCT CCG CTG TCT TCT TGT CCA TCT CAA GCT CTT CAG CTG  245
Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln Leu
            60                  65                  70

GCT GGT TGT CTG TCT CAA CTG CAT TCT GGT CTG TTC CTG TAT CAG GGT  293
Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln Gly
        75                  80                  85

CTT CTG CAA GCT CTG GAA GGT ATC TCT CCG GAA CTG GGT CCG ACT CTG  341
Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu
    90                  95                  100

GAC ACT CTG CAG CTA GAT GTA GCT GAC TTT GCT ACT ACT ATT TGG CAA  389
Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp Gln
105                 110                 115                 120

CAG ATG GAA GAG CTC GGT ATG GCA CCA GCT CTG CAA CCG ACT CAA GGT  437
Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln Gly
                125                 130                 135

GCT ATG CCG GCA TTC GCT TCT GCA TTC CAG CGT CGT GCA GGA GGT GTA  485
Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val
            140                 145                 150

CTG GTT GCT TCT CAT CTG CAA TCT TTC CTG GAA GTA TCT TAC CGT GTT  533
Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val
        155                 160                 165

CTG CGT CAT CTG GCT CAG CCG TAATAGAATT C                         565
Leu Arg His Leu Ala Gln Pro
    170                 175
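
The feature coordinates given for SEQ ID NO:1 can be cross-checked with a short sketch. It assumes the standard genetic code and uses only the first line of the sequence as printed above; the helper names are hypothetical. A coding region at 30..554 should span a whole number of codons equal to the 175 residues of SEQ ID NO:2, and the untranslated leader should place the initiating ATG at position 30.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(start, end):
    """Number of codons in an inclusive, 1-based coding region."""
    length = end - start + 1
    if length % 3:
        raise ValueError("coding region is not a whole number of codons")
    return length // 3


def translate(dna):
    """Translate DNA (spaces ignored) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Leader and first eight codons, copied from the first sequence line above.
LEADER = "TCTAGAAAAA ACCAAGGAGG TAATAAATA".replace(" ", "")
FIRST_CODONS = "ATG ACT CCA TTA GGT CCT GCT TCT"

cds_start = len(LEADER) + 1
n_codons = codon_count(30, 554)
first_residues = translate(FIRST_CODONS)
```

Here `first_residues` reads Met-Thr-Pro-Leu-Gly-Pro-Ala-Ser in one-letter form, matching the translation printed beneath the first line.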

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1               5                   10                  15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
            20                  25                  30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
        35                  40                  45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
    50                  55                  60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65                  70                  75                  80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
                85                  90                  95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
            100                 105                 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
        115                 120                 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
    130                 135                 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145                 150                 155                 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
                165                 170                 175

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CaXTCTGCTG cgttgtctgg aaca

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
ACAGGTTCGT GGTATCCAGG GTG

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CACTGCAAGA ACGTCTGTGC GCT

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CGCTACTTAC CGTCTGTGCC ATC



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CTTTCTGCTG CGTTGTCTGG AACA

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
ACAGGTTCGT CGTATCCAGG GTG

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
r-CTGCAAGA ACX5TCTGTGC GCT

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CTTTCTGCTG CGTTGTCTGG AACA



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
ACAGGTTCGT GGTATCCAGG GTG

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CGCTACTTAC CGTCTGTCCC ATC

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CTTTCTGCTG CGTTGTCTGG AACA

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
^ tTGCAACaA AOGTCTGTGC GCT



(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CGCTACTTAC CGTCTGTGCC ATC

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
ACAGGTTCGT CGTATCCAGG GTG

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CACTGCAAGA ACGTCTGTGC GCT

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
CGCTACTTAC CGTCTGTGCC ATC



(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



EP 0612 846 B1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
Cmr X XICTG OGTTGTCTQG AACA 24



(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
' "VjGTTOGT 0GTATCCAG6 6TG 23

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CACTGCAAGA ACGTCTCTGC GCT 23



(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CGCTACTTAC CGTCTGTGCC ATC 23

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
TCTGCTGAAA GCTCTGGAAC AGG 23

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CTTGTCCATC TGAAGCTCTT CAG 23
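
By way of illustration, a mutagenic oligonucleotide such as SEQ ID NO:24 can be compared base by base with the corresponding stretch of SEQ ID NO:1 to locate the altered position. The sketch below assumes the wild-type stretch is the 23 bases of SEQ ID NO:1 beginning at the second base of the Ser 64 codon (read off the listing above); the function name is hypothetical.

```python
def mismatches(template, oligo):
    """Return (index, template base, oligo base) for each differing position."""
    if len(template) != len(oligo):
        raise ValueError("sequences must be the same length")
    return [
        (i, t, o)
        for i, (t, o) in enumerate(zip(template, oligo))
        if t != o
    ]


# Wild-type stretch taken from SEQ ID NO:1 (an assumption of this sketch)
# and the oligonucleotide of SEQ ID NO:24 as printed.
WILD_TYPE = "CTTGTCCATCTCAAGCTCTTCAG"
SEQ_ID_24 = "CTTGTCCATCTGAAGCTCTTCAG"

differences = mismatches(WILD_TYPE, SEQ_ID_24)
```

Under this alignment the only difference is a C to G change at the first base of the codon for residue 68, which would convert CAA (Gln) to GAA (Glu).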

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
r-JUUJlCTGT COGCrTACTTA CAAACTGTCC CATCCGG 37

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
TTCGTAAAAT CGCGGGTGAC GG 22

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TCATCTGGCT 00GC06TAAT AG

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CCGTGTTCTG GCTCATCTGG CT



(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GAAGTATCTT ACGCTGTTCT GCGT

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
GAAGTATCTT ACTAAGTTCT GCGTC

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
--iCTACnTAC GCACTGTGCC AT

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
CAAACTGTGC AAGCCGGAAG AG



(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
CATCCGGAAG CACTGGTACT GC

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
GGAACAGGTT GCTAAAATCC AGG

(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GAACA06TTC GTGOQATCCA GGGT6

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
AAT6TCTG (3CACAG(3TTC 6T



(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
TCCAGGGTGC C(3GTOCTGC

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
AAGAGCTCGG TGAGGCACCA GCT

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
CTCAAGGTGC TGAGCCGGCA TTC

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
GAGCTCGGTC TGGCACCAGC



(2) INFORMATION FOR SBQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
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(xl) SEQUENCE DESCRIPTION: SBQ ID NO: 41 
TCAAG6T6CT CTGCCGGCAT T 



(2) INF0RM21TI0N FOR SBQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) T^TFB: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TVPB: DNA 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
- TGCCOCAA OCCTTTCTQC TGA 



(2) INFORMATION FOR SEQ ID N0:43: 

(i) SBQX7ENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 
CmntSCTGr GCATGTCTGG AACA 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 
CTATTTGGCA AGCGATGGAA GAGC



(2) INFORMATION FOR SEQ ID NO:45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
CAGATGGAAG 0GCTCGGTAT G 21

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
GAGCTCGGTC TGGCACCAGC 20 

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TCAAGGTGCT CTGCCGGCAT T 21

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
GAAATGTCTG GCACAGGTTC GT 22

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
TTCCGGAGCG CACAGTTTG 



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CGAGAAGGCC TCGGGTGTCA AAC



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
ATGCCAAATT GCAGTAGCAA AG



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
ACAACGGTTT AAOOTCATOS TTTC 24 

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
/-'CAGCTACT GCTAGCIGCA GA 22 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
TCAGTCGATG ACGATCXSACX3 TCT 23 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

TTACGAACOG CTTCCAGACA TT 22 



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
TAAAATGCTT GGCGAAGGTC TGTAA



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GTAGCAAATG CAGCTACATC TA



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 
TCATCGTT TACGTCGATG TAGAT



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
CCAAGAGAAG CZACCCAGCAG

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
AGGGTTCTCT TCGTGGGTOG TC 22 

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
CACTGGCGGT GATAATGAGC 20

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
CTTAGGCCAGG CATTACTGG 19



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
CCACTGGCGG TGATACTGAG C



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
AGCAGAAAGC TTTCCGGCAG AGAAGAAGCA GGA



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
GCCGCAAAGC TTTCTGCTGA AATGTCTGGA AGAGGTTCGT AAAATCCAGG GTGA



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
CTGGAATGCA GAAGCAAATG CCGGCATAGC ACCTTCAGTC GGTTGCAGAG CTGGTGCCA



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Arg Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Arg Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Arg Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Arg Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Arg Cys Leu Glu Gln Val Arg Arg Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30 

Gln Glu Arg Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Arg Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Arg Leu Cys Ala Thr Tyr Arg Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Arg Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Arg Leu Cys Ala Thr Tyr Arg Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Arg Cys Leu Glu Gln Val Arg Arg Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Arg Leu Cys Ala Thr Tyr Arg Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Arg Cys Leu Glu Gln Val Arg Arg Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Arg Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Glu Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Ser Ala Thr Tyr Lys Leu Ser His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Ala Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Ala Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Ala His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Ala Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Val Leu Arg His Leu Ala Gln Pro
165 170 174

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Ala Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys Lys Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Ala Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Ala Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Ala Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 88: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Ala Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Ala Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Glu Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Glu Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Leu Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Leu Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Ala Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Glu Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Glu Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Glu Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Glu Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Glu Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Gly Phe Leu Leu
1 5 10 15

Lys Cys Leu Ala Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Leu Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Leu Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ala Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Ala Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Ala Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Ala Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys Ala Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly Ala Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Ala Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Ala Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Ala Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Ala Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Ala Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Ala Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175
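
The listings in this section differ from one another at only a few positions. As a minimal sketch (an illustration only, not part of the sequence listing itself), the following Python compares residues 113 to 128 as given for SEQ ID NO: 107 and SEQ ID NO: 109 and reports the positions at which they differ. The function name `diff_residues` and the choice of fragment are assumptions made for this example.

```python
def diff_residues(first, second, start):
    """Return (position, residue_in_first, residue_in_second) for each mismatch.

    `first` and `second` are space-separated three-letter residue strings with
    the same number of residues; `start` is the position of their first residue.
    """
    a, b = first.split(), second.split()
    if len(a) != len(b):
        raise ValueError("fragments must have the same number of residues")
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Residues 113-128 as listed for SEQ ID NO: 107 and SEQ ID NO: 109.
SEQ_107_113_128 = "Asp Phe Ala Thr Ala Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala"
SEQ_109_113_128 = "Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Ala Leu Gly Met Ala"

differences = diff_residues(SEQ_107_113_128, SEQ_109_113_128, start=113)
for position, res_107, res_109 in differences:
    print(f"position {position}: {res_107} (NO: 107) vs {res_109} (NO: 109)")
```

Passing the first position of the fragment as `start` keeps the reported positions in the numbering used under each row of the listing.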






(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
1 5 10 15

Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
20 25 30

Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
35 40 45

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
65 70 75 80

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
85 90 95

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
100 105 110

Asp Val Ala Thr Ala Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
115 120 125

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
130 135 140

Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
145 150 155 160

Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
165 170 175

Claims

1. A method for preparing a G-CSF analog comprising the steps of:

(a) viewing at the amino acid or atomic level information conveying the three dimensional structure of a G-CSF
molecule as set forth in Figure 5;

(b) selecting from said viewed information at least one site on said G-CSF molecule for alteration;

(c) preparing a G-CSF molecule having such alteration; and

(d) optionally, testing such G-CSF molecule for a desired characteristic.

2. A method for preparing a G-CSF analog according to claim 1 based on the use of a computer comprising the steps
of:

(a) providing computer expression at the amino acid or atomic level of the three dimensional structure of a
G-CSF molecule as set forth in Figure 5;

(b) selecting from said computer expression at least one site on said G-CSF molecule for alteration;

(c) preparing a G-CSF molecule having such alteration; and

(d) optionally, testing such G-CSF molecule for a desired characteristic.

3. A method for preparing a G-CSF analog according to claim 2 comprising:

(a) providing said computer with the means for displaying the three dimensional structure of a G-CSF molecule
as set forth in Figure 5, including displaying the composition of moieties of said G-CSF molecule, preferably
displaying the three dimensional location of each amino acid, and more preferably displaying the three
dimensional location of each atom of a G-CSF molecule;

(b) viewing said display;

(c) selecting a site on said display for alteration in the composition of said molecule or the location of a moiety;
and

(d) preparing a G-CSF analog with such alteration.

4. A computer-based method for preparing a G-CSF analog comprising the steps of:

(a) viewing at the amino acid or atomic level the three dimensional structure of a G-CSF molecule as set forth
in Figure 5, via a computer, said computer having been previously programmed (i) to express the coordinates
of a G-CSF molecule in three dimensional space, and (ii) to allow for entry of information for alteration of said
G-CSF expression and viewing thereof;

(b) selecting a site on said visual image of said G-CSF molecule for alteration;

(c) entering information for said alteration on said computer;

(d) viewing a three dimensional structure of said altered G-CSF molecule via said computer;

(e) optionally repeating steps (a)-(e) above;

(f) preparing a G-CSF analog with said alteration; and

(g) optionally testing said G-CSF analog for a desired characteristic.
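
The selecting and altering steps recited above can be illustrated at the sequence level only. The following is a minimal sketch, assuming the molecule is represented as a list of three-letter residue codes and using a hypothetical helper named `substitute`; it does not model the three dimensional structure of Figure 5 or any display of it. The example fragment is residues 17 to 32 as listed for SEQ ID NO: 101, and the alteration shown (Lys at position 17 replaced by Ala) reproduces the corresponding row of SEQ ID NO: 100.

```python
def substitute(residues, start, position, expected, replacement):
    """Return a copy of `residues` with the residue at `position` replaced.

    `start` is the sequence position of residues[0]. The residue currently at
    `position` must equal `expected`, which guards against off-by-one errors.
    """
    index = position - start
    if not 0 <= index < len(residues):
        raise IndexError(f"position {position} is outside this fragment")
    if residues[index] != expected:
        raise ValueError(
            f"expected {expected} at {position}, found {residues[index]}"
        )
    altered = list(residues)
    altered[index] = replacement
    return altered


# Residues 17-32 as listed for SEQ ID NO: 101 (no alteration in this region).
fragment = "Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu".split()

analog = substitute(fragment, start=17, position=17, expected="Lys", replacement="Ala")
print(" ".join(analog))
```

Checking the expected residue before replacing it is a design choice for the sketch: it makes a mistyped position fail loudly instead of silently altering the wrong site.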

Patentansprüche

1. Verfahren zur Herstellung eines G-CSF-Analogs, welches die Schritte umfaßt:

(a) Betrachten, auf dem Aminosäure- oder Atomniveau, von Information, welche die dreidimensionale Struktur
eines G-CSF-Moleküls, wie angegeben in Fig. 5, vermittelt;

(b) Auswählen, aus besagter betrachteten Information, von wenigstens einer Stelle auf besagtem
G-CSF-Molekül für eine Veränderung;

(c) Herstellen eines G-CSF-Moleküls mit einer solchen Veränderung; und

(d) fakultativ, Testen eines solchen G-CSF-Moleküls auf eine gewünschte Eigenschaft.

2. Verfahren zur Herstellung eines G-CSF-Analogs nach Anspruch 1, auf der Basis der Verwendung eines Computers,
welches die Schritte umfaßt:

(a) Bereitstellen einer Computerdarstellung, auf dem Aminosäure- oder Atomniveau, der dreidimensionalen
Struktur eines G-CSF-Moleküls, wie angegeben in Fig. 5;

(b) Auswählen, aus besagter Computerdarstellung, von wenigstens einer Stelle auf besagtem G-CSF-Molekül
für eine Veränderung;

(c) Herstellen eines G-CSF-Moleküls mit einer solchen Veränderung; und

(d) fakultativ, Testen eines solchen G-CSF-Moleküls auf eine gewünschte Eigenschaft.

3. Verfahren zur Herstellung eines G-CSF-Analogs nach Anspruch 2, welches umfaßt:

(a) Versehen besagten Computers mit Mitteln zum Anzeigen der dreidimensionalen Struktur eines
G-CSF-Moleküls, wie angegeben in Fig. 5, einschließlich Anzeigen der Zusammensetzung der Einheiten besagten
G-CSF-Moleküls, vorzugsweise Anzeigen der dreidimensionalen Anordnung jeder Aminosäure und bevorzugter
Anzeigen der dreidimensionalen Anordnung jedes Atoms eines G-CSF-Moleküls;

(b) Betrachten besagter Ansicht;

(c) Auswählen einer Stelle auf besagter Ansicht für eine Veränderung in der Zusammensetzung besagten
Moleküls oder der Anordnung einer Einheit; und

(d) Herstellen eines G-CSF-Analogs mit solch einer Änderung.

4. Computergestütztes Verfahren zur Herstellung eines G-CSF-Analogs, welches die Schritte umfaßt:

(a) Betrachten, auf dem Aminosäure- oder Atomniveau, der dreidimensionalen Struktur eines G-CSF-Moleküls,
wie angegeben in Fig. 5, über einen Computer, wobei besagter Computer zuvor so programmiert worden
ist, daß er (i) die Koordinaten eines G-CSF-Moleküls im dreidimensionalen Raum darstellt und (ii) die Eingabe
von Information zur Veränderung besagter G-CSF-Darstellung und Betrachtung derselben ermöglicht;

(b) Auswählen einer Stelle auf besagtem visuellen Bild besagten G-CSF-Moleküls für eine Veränderung;

(c) Eingeben der Information für besagte Veränderung in besagten Computer;

(d) Betrachten einer dreidimensionalen Struktur besagten veränderten G-CSF-Moleküls über besagten
Computer;

(e) fakultativ, Wiederholen der Schritte (a) - (e) oben;

(f) Herstellen eines G-CSF-Analogs mit besagter Veränderung; und

(g) fakultativ, Testen besagten G-CSF-Analogs auf eine gewünschte Eigenschaft.
Revendications

1. Procédé pour préparer un analogue de G-CSF, comprenant les étapes de :

(a) visualiser au niveau atomique ou des acides aminés des informations fournissant la structure
tridimensionnelle d'une molécule de G-CSF comme indiqué sur la figure 5 ;

(b) choisir à partir desdites informations visualisées au moins un site sur ladite molécule de G-CSF pour
altération ;

(c) préparer une molécule de G-CSF ayant une telle altération ; et

(d) éventuellement, tester une telle molécule de G-CSF en ce qui concerne une caractéristique souhaitée.

2. Procédé pour préparer un analogue de G-CSF selon la revendication 1, basé sur l'utilisation d'un ordinateur,
comprenant les étapes de :

(a) fournir l'expression par ordinateur au niveau atomique ou des acides aminés de la structure
tridimensionnelle d'une molécule de G-CSF comme indiqué sur la figure 5 ;

(b) choisir à partir de ladite expression par ordinateur au moins un site sur ladite molécule de G-CSF pour
altération ;

(c) préparer une molécule de G-CSF ayant une telle altération ; et

(d) éventuellement, tester une telle molécule de G-CSF en ce qui concerne une caractéristique souhaitée.
3. A method for preparing a G-CSF analog according to claim 2, comprising:

(a) providing said computer with the means for displaying the three-dimensional structure of a G-CSF molecule as set forth in FIGURE 5, including displaying the composition of moieties of said G-CSF molecule, preferably displaying the three-dimensional location of each amino acid and, more preferably, displaying the three-dimensional location of each atom of a G-CSF molecule;

(b) viewing said display;

(c) selecting a site on said display for alteration of the composition of said molecule or for replacement of a moiety; and

(d) preparing a G-CSF analog having such alteration.

4. A computer-based method for preparing a G-CSF analog, comprising the steps of:

(a) viewing, at the atomic or amino acid level, the three-dimensional structure of a G-CSF molecule as set forth in FIGURE 5 via a computer, said computer having been previously programmed (i) to express the coordinates of a G-CSF molecule in three-dimensional space, and (ii) to allow entry of information for alteration of said G-CSF expression and viewing thereof;

(b) selecting a site on said visual image of said G-CSF molecule for alteration;

(c) entering information for said alteration into said computer;

(d) viewing a three-dimensional structure of said altered G-CSF molecule via said computer;

(e) optionally, repeating steps (a) - (e) above;

(f) preparing a G-CSF analog having said alteration; and

(g) optionally, testing said G-CSF analog for a desired characteristic.
                              Met Thr Pro Leu Gly Pro Ala
TCTAGAAAAAACCAAGGAGGTAATAAATA ATG ACT CCA TTA GGT CCT GCT

Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Cys Leu Glu Gln
TCT TCT CTG CCG CAA AGC TTT CTG CTG AAA TGT CTG GAA CAG

Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu
GTT CGT AAA ATC CAG GGT GAC GGT GCT GCA CTG CAA GAA AAA CTG

Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu
TGC GCT ACT TAC AAA CTG TGC CAT CCG GAA GAG CTG GTA CTG CTG

Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro
GGT CAT TCT CTT GGG ATC CCG TGG GCT CCG CTG TCT TCT TGT CCA

Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser
TCT CAA GCT CTT CAG CTG GCT GGT TGT CTG TCT CAA CTG CAT TCT

Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
GGT CTG TTC CTG TAT CAG GGT CTT CTG CAA GCT CTG GAA GGT ATC

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val
TCT CCG GAA CTG GGT CCG ACT CTG GAC ACT CTG CAG CTA GAT GTA

Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly
GCT GAC TTT GCT ACT ACT ATT TGG CAA CAG ATG GAA GAG CTC GGT

Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe
ATG GCA CCA GCT CTG CAA CCG ACT CAA GGT GCT ATG CCG GCA TTC

Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser
GCT TCT GCA TTC CAG CGT CGT GCA GGA GGT GTA CTG GTT GCT TCT

His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His
CAT CTG CAA TCT TTC CTG GAA GTA TCT TAC CGT GTT CTG CGT CAT

Leu Ala Gln Pro OC  AM
CTG GCT CAG CCG TAA TAG AATTC
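
The pairing of each amino acid label with the codon printed beneath it can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and the conventional stop-codon names (TAA = ochre "OC", TAG = amber "AM", TGA = opal "OP"); the function name translate_row and the table names are illustrative and do not appear in the specification. The two codon rows passed to it are copied from the listing above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# One-letter amino acid codes; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter amino acid abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Conventional names for the three stop codons (assumed labelling).
STOP_NAMES = {"TAA": "OC", "TAG": "AM", "TGA": "OP"}


def translate_row(codon_row: str) -> str:
    """Translate a whitespace-separated row of DNA codons into labels."""
    labels = []
    for codon in codon_row.split():
        if codon in STOP_NAMES:
            labels.append(STOP_NAMES[codon])
        else:
            labels.append(THREE_LETTER[CODON_TABLE[codon]])
    return " ".join(labels)


if __name__ == "__main__":
    # Two rows copied from the sequence listing above.
    print(translate_row(
        "TGC GCT ACT TAC AAA CTG TGC CAT CCG GAA GAG CTG GTA CTG CTG"
    ))
    print(translate_row("CTG GCT CAG CCG TAA TAG"))
```

Applying the same function to each remaining codon row gives a quick way to confirm that the amino acid line printed above it is consistent with the nucleotide line.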
FIGURE 2
FIGURE 4
FIGURE 6



